in this issue ...



by Jane-Ellen Long 

Managing Editor 


<jel@usenix.org> 


This issue seems particularly packed with good advice. Who isn’t concerned
about security these days? Phil Cox and Tina Darmohray tell the tale of
Tina’s quest: how to turn off SMB printer and file sharing so that an NT
machine can run a Web server with decent security. A long chase and a
tough one, but the winner of SAGE’s Outstanding Achievement award never
faltered.


When Bruce Mohler needed to automate his collection of system profiles, he turned to
Perl and the Web. His syssumm software gathers an impressive amount of UNIX
configuration information, organizes it, and displays it via a Web browser on request.
Way cool!

We’re delighted to announce that Clif Flynt, author of Tcl/Tk for Real Programmers, has
joined the ranks of ;login: columnists. To learn about Tcl and Tk from an expert, turn to
“The Tclsh Spot.”

Investigating cryptography? Matt Curtin tells you when not to believe what the vendor
tells you. Maybe you’ve been wondering about biometric authentication systems. Dario
Forte has the latest word. Battling over ownership of root? Jim Hickstein takes a
somewhat heretical stance: sysadmins should consider letting some users in on the ground
floor. You’ve probably noticed that there’s more than one way to perform taxidermy on
a feline, or to password-protect Web pages. As a cat-lover, I’ve censored the article on
the former, but Dave Taylor gives the pros and cons of various approaches to the latter.

Spend 10 minutes, save 6 hours a day - of CPU time, that is. USENIX Board President
Andrew Hume gives a real-life example of performance optimization. Glen McCluskey
addresses the costs of data formatting. From other columns, learn how to write applets,
how to help developers write apps your network can handle, and how to analyze log files
using Perl.

When you’ve taken advantage of all this information, you may start thinking about
polishing up your resume. David Clark has some suggestions on how to make the most of
your skills and experience.

Or you may be inspired to share your own hard-won techniques. If you’ve been working
on the problems associated with UNIX on the desktop, please turn to Rik Farrow’s
column this month. If your expertise lies elsewhere, remember, we’re always looking for
;login: articles, photographs, book reviews, apt cartoons, and outraged Letters to the
Editor. Send email to <login@usenix.org>.
letters to the editor


Correction: MRTG-2 Is Still Supported! 

In the February ;login: there was a review
of my LISA ’98 MRTG talk. The article
says that I am ending support for MRTG
due to time constraints. This is not the
case.

What I said in the talk was that I wanted
to encourage people to migrate to
rrdtool/cricket/mrtg-3 once these programs
become stable. I am not actively continuing
to develop mrtg-2 because I invest
what time I have in mrtg-3/rrdtool.

But, as you can see from the mrtg-2
releases published in the last few weeks, I
am still maintaining mrtg-2, although I
am not adding new features but only
integrating patches and removing bugs.

Cheers, 

Tobi 

Tobias Oetiker 
<oetiker@ee.ethz.ch> 

Correction: Solaris Certification Is 
Exam-Based 

The column “Taming the Certification
Beast” [October 1998 ;login:] made the
statement that Solaris certification “relies
on course attendance rather than testing.”
This is inaccurate. The exams are based
closely on Sun Educational Services
course material, but the certification itself
is based 100% on exam performance.
There is no requirement whatsoever that
a candidate for certification ever have
attended a Sun course, and indeed there
are a lot of Sun-certified sysadmins (and
network admins) out there who did it all
on their own.

I am not affiliated with Sun 
Microsystems in any way. 

From the catapult of J.D. Baldwin 

J.D. Baldwin 
<baldwin@netcom.com> 

We stand both cheered and chastened. - 
Managing Editor 



The USENIX Crossword Puzzle 

Across
1. Splendor
5. Landing gear support
10. Remain
14. Sword
15. Charity
16. Jason’s ship
17. Represser
19. Goat tailed deity
20. Sets up
21. Lamented
23. Italian desserts
24. Racial epithets
25. Fast Canadian compiler
28. Two continents
31. Hunter in stars
32. Mustard plant
33. Yo ho ho drink
34. Encircle
35. Ringlets
36. Summon to court
37. Came into contact
38. Relating to liquid concentration
39. Business language
40. Slipshod
42. Tree group
43. Makes coins
44. Gasp
45. Nasty fly
47. Close to the lower limit
51. Ninth Greek letter
52. Whale oil for candles
54. Two person combat
55. Brief
56. Andy’s pal
57. Strays
58. Keanu movie
59. Finnish dude

Down
1. Graceful female
2. ___ Source Software
3. Army food hall
4. Cavil
5. Climber
6. Not heads
7. Executes
8. Employ
9. Unconditioned
10. African adventure
11. Record
12. Fit of shivering
13. Over there
18. Radar beacon (abbr)
22. Belonging to us
24. Aroma
25. Maggots
26. Uranus moon
27. Color measuring device
28. Matrix
29. Cars
30. Refine metal
32. Separates
35. Challenges
36. Involving a brain part
38. Abdo and sped follower
39. Cuban dance
41. Essential parts
42. Tilled
44. Resolve to grammar parts
45. Periodic water movement
46. Tart
47. Absolute, undiminished
48. Cylindrical parasitic worm (short.)
49. Upon
50. AI Language
53. Get up and go
SAGE news & features


NT: The Manual Is "Not There"!

by Phil Cox
<pcc@ntsinc.com>

Phil is a member of the Computer Incident Advisory Capability (CIAC) for the
Department of Energy. He also consults and writes on issues bridging the gap
between UNIX and Windows NT.

and Tina Darmohray
<tmd@usenix.org>

Tina Darmohray, editor of SAGE News & Features, is a consultant in the area of
Internet firewalls and network connections, and frequently gives tutorials on
those subjects. She was a founding member of SAGE.



I (Tina) recently took a look at the security
implications of running SMB applications
over the Internet. Initially, I
thought the best place to look for information
would be on an NT machine
itself. I clicked on the Help button and
proceeded to form the question. In particular,
I asked about network services
and bindings. The response to my first
query looked promising:

binding: 

To change the order of bindings for 
selected network components 

To enable or disable binding paths for 
selected network components 

To view bindings for network 
components 

I selected the second one, and more help
information appeared. The first part was
a hand-holding walk-through on how to
enable/disable bindings. Fair enough. But
I really wanted to know exactly what services
would break if I did so. I read on.
The accompanying “Notes” section took
the wind out of my sails in that regard,
saying, “Do not attempt to change binding
settings unless you are an experienced
network administrator familiar with the
requirements of your network software.”
And that was it! I poked around some
more with the “Help” utility, but I
couldn’t get anything more detailed than
the warning. I began to feel frustrated,
wondering how one was to “RTM” when
the “M” isn’t there!

Sheepishly, but having at least tried to
RTM, I turned to several NT gurus to ask
them more about what services the bindings
affect. To my surprise, many of them
had the same questions I did. So I decided
it was time to go sniffing for answers,
and that you might be interested in what
I found out.

Microsoft networking was originally 
designed for small networks. When 
Microsoft decided to extend it, they 
needed a capable transport protocol to 
do so. The result is NetBIOS over TCP/IP, 
or NBT. NT machines use ports 135 and 
137-139 for all their Windows-related 
networking traffic. A breakdown of the 
ports looks like this: 


Port  Name         Service
135   loc-srv      Location Service
137   netbios-ns   NetBIOS Name Service
138   netbios-dgm  NetBIOS Datagram Service
139   netbios-ssn  NetBIOS Session Service


Location Service is like the UNIX 
portmapper and is used to get informa¬ 
tion about the RPC programs registered 
on the machine. NetBIOS Name Service 
is used for registering and gaining infor¬ 
mation about NetBIOS names. The 
NetBIOS Datagram and Session Services 
can be viewed as UDP and TCP for 
NetBIOS packets. 
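
The port breakdown above lends itself to a small lookup table, for example when
reading firewall logs or deciding which ports to filter at a border router. The
following is a minimal sketch in Python, not anything from the original
investigation; the names NBT_PORTS and describe_port are made up for
illustration, and only the port numbers and service names come from the table.

```python
# Ports and service names as listed in the article's table.
# The dictionary and function names are illustrative assumptions.
NBT_PORTS = {
    135: ("loc-srv", "Location Service"),
    137: ("netbios-ns", "NetBIOS Name Service"),
    138: ("netbios-dgm", "NetBIOS Datagram Service"),
    139: ("netbios-ssn", "NetBIOS Session Service"),
}


def describe_port(port):
    """Return 'port/name: description', or None if the port is not in the table."""
    entry = NBT_PORTS.get(port)
    if entry is None:
        return None
    name, description = entry
    return f"{port}/{name}: {description}"


if __name__ == "__main__":
    # Label a few ports the way a log-review script might.
    for port in (80, 135, 139):
        print(describe_port(port) or f"{port}: not a Windows networking port")
```

A script built on this table could flag any Internet-facing traffic to these
ports for review.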

I wanted to turn off selected Microsoft 
networking services so that an NT 
machine wouldn’t be running the file and 
printer sharing (a.k.a. Server Service) 
portion of the SMB applications on the 


SAGE, the System Administrators Guild, is a
Special Technical Group within USENIX. It is
organized to advance the status of computer
system administration as a profession,
establish standards of professional excellence
and recognize those who attain them, develop
guidelines for improving the technical and
managerial capabilities of members of the
profession, and promote activities that advance
the state of the art or the community.

All system administrators benefit from the
advancement and growing credibility of the
profession. Joining SAGE allows individuals and
organizations to contribute to the community of
system administrators and the profession as a
whole.


SAGE membership includes USENIX membership.
SAGE members receive all USENIX member
benefits plus others exclusive to SAGE.

SAGE members save when registering for USENIX
conferences and conferences co-sponsored by
SAGE.

SAGE publishes a series of practical booklets.
SAGE members receive a free copy of each
booklet published during their membership term.

SAGE sponsors an annual survey of sysadmin
salaries collated with job responsibilities.
Results are available to members online.

The SAGE Web site offers a members-only
Jobs-Offered and Positions-Sought Job Center.


SAGE STG EXECUTIVE COMMITTEE 

President: 

Hal Miller <halm@usenix.org> 

Vice-President: 

Barbara L. Dijker <barb@usenix.org> 

Secretary: 

Tim Gassaway <gassaway@usenix.org> 

Treasurer: 

Peg Schafer <peg@usenix.org> 

Members: 

Xev Gittler <xev@usenix.org> 

Geoff Halprin <geoff@usenix.org> 

Jim Hickstein <jim@usenix.org> 
Internet, but could still run a Web server, for instance. In NT you
create “bindings” between logical connections and services,
protocols, and adapters. I hypothesized that turning off pieces of
NetBIOS on the Internet-connected adapter would disallow the
targeted SMB services and still allow more traditional Internet
services. I used the Network property sheet from the Control
Panel, which contains the Bindings tab, to toggle bindings on
and off. While I systematically turned on and off the services on
the NT machine, I used SAMBA commands from a UNIX
machine to look at the NT responses.

First, I turned off the Control Panel-level Server Service. As 
expected, the NT machine didn’t answer a NetBIOS name query 
at all: 

% nmblookup -B 204.146.133.23 -S \* 

Sending queries to 204.146.133.23 
name_query failed to find name * 

I reenabled the Server Service and then, using CONTROL
PANEL/NETWORK/BINDINGS [all adapters], turned off WINS
Client [TCP/IP]. Again, as expected, the NT machine failed to
respond to a WINS name query:

% nmblookup -B 10.31.3.163 -S \* 

Sending queries to 10.31.3.163 
name_query failed to find name * 

Next, I reenabled the WINS Client [TCP/IP] and tested to make 
sure that the machine was responding. In the successful NetBIOS 
name query, we see that the machine lists a NetBIOS name type 
of <20>, which indicates a resource-sharing “server service”; this 
is what we would expect. Since we have a server service, we can 
now use a subsequent smbclient query to that server. 

% nmblookup -B 10.31.3.163 -S \*

Sending queries to 10.31.3.163
10.31.3.163 *<00>

Looking up status of 10.31.3.163
received 10 names

PI              <00> -         B <ACTIVE>
INet~Services   <1c> - <GROUP> B <ACTIVE>
PI              <20> -         B <ACTIVE>
IS-PI           <00> -         B <ACTIVE>
SUNNYVALE       <00> - <GROUP> B <ACTIVE>
PI              <03> -         B <ACTIVE>
SUNNYVALE       <1e> - <GROUP> B <ACTIVE>
ADMINISTRATOR   <03> -         B <ACTIVE>
SUNNYVALE       <1d> -         B <ACTIVE>
..__MSBROWSE__. <01> - <GROUP> B <ACTIVE>

num_good_sends=0 num_good_receives=0



% smbclient -L PI -I 10.31.3.163 


Added interface ip=10.31.3.161 bcast=10.31.3.255 
nmask=255.255.255.0 


Server time is Wed Jan 27 14:32:07 1999 
Timezone is UTC-8.0 


Password: 

Domain= [SUNNYVALE] OS= [Windows NT 4.0] 
Server=[NT LAN Manager 4.0] 
security=user 


This machine has a browse list: 


Server Comment 


PI 

This machine has a workgroup list: 

Workgroup Master 

SUNNYVALE PI 

A machine configured in this way would serve File and Print 
shares on the Internet, which is the NetBIOS service I ultimately 
wanted to turn off. 

Finally, I expanded the “+” WINS Client [TCP/IP]. It shows 
three bindings that can be toggled on and off: NetBIOS 
Interface, Server, and Workstation. With the top-level WINS 
Client [TCP/IP] binding still enabled, I disabled the “Server” 
binding (below WINS Client [TCP/IP]). Note that we can still 
successfully query for the name, but the results show there is no 
longer a server service, type <20>, on the machine. As a result, 


SAGE MEMBERSHIP 
<office@usenix.org> 

SAGE ONLINE SERVICES 

Email server: <majordomo@usenix.org> 
Web: <http://www.usenix.org/sage/> 
Collective Technologies

Deer Run Associates 

D.E. Shaw & Co. 

Global Networking & Computing, Inc. 
Mentor Graphics Corp. 

Microsoft Research 
MindSource Software Engineers 


New Riders Press 
O’Reilly & Associates 
Remedy Corporation 
SysAdmin Magazine 
Taos Mountain 

TransQuest Technologies, Inc. 
UNIX Guru Universe (UGU) 
the subsequent smbclient query is unsuccessful and requests for
File and Print shares would fail.

% nmblookup -B 10.31.3.163 -S \*

Sending queries to 10.31.3.163
10.31.3.163 *<00>

Looking up status of 10.31.3.163
received 5 names

PI              <00> -         B <ACTIVE>
INet~Services   <1c> - <GROUP> B <ACTIVE>
IS-PI           <00> -         B <ACTIVE>
SUNNYVALE       <00> - <GROUP> B <ACTIVE>
PI              <03> -         B <ACTIVE>

num_good_sends=0 num_good_receives=0

% smbclient -L PI -I 10.31.3.163
Added interface ip=10.31.3.161 bcast=10.31.3.255
nmask=255.255.255.0
Session request failed (131,130) with
myname=EPSILON destname=PI
Called name not present

Try to connect to another name (instead of PI)

You may find the -I option useful for this

Don't let the lack of detailed online NT documentation dissuade 
you, or the counterintuitive naming of the WINS “Client” 
[TCP/IP] binding fool you. Expand the “+” on the WINS Client 
[TCP/IP] binding to see that underneath the “Client” lies a 
“Server,” which you can enable/disable. It might help to think of 
the WINS “Client” as NBT, and the bindings underneath as 
pieces of NetBIOS over TCP/IP. The bottom line is that there is a 
degree of granular configuration control for NBT using bindings 
to disable all or pieces of NetBIOS on a single adapter. 
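
The test applied throughout this walk-through, whether a type <20> "server
service" name shows up in the node status, is easy to script if you are
checking several machines. The sketch below is written in Python and is only an
illustration: it assumes the status lines look like the nmblookup output
printed in this article, and the names parse_status and has_server_service are
invented for the example.

```python
import re

# Matches status lines such as "PI <20> - B <ACTIVE>" or
# "SUNNYVALE <00> - <GROUP> B <ACTIVE>" (layout assumed from the output above).
NAME_LINE = re.compile(
    r"^\s*(?P<name>\S.*?)\s*<(?P<suffix>[0-9A-Fa-f]{2})>"
    r"\s*-\s*(?P<group><GROUP>)?\s*\w\s*<ACTIVE>"
)


def parse_status(text):
    """Return a list of (name, suffix, is_group) tuples from node-status text."""
    entries = []
    for line in text.splitlines():
        match = NAME_LINE.match(line)
        if match:
            entries.append(
                (
                    match.group("name"),
                    int(match.group("suffix"), 16),
                    match.group("group") is not None,
                )
            )
    return entries


def has_server_service(text):
    """True if a unique (non-group) name of type <20> is registered."""
    return any(
        suffix == 0x20 and not is_group
        for _name, suffix, is_group in parse_status(text)
    )
```

Feeding it the saved output of "nmblookup -B <address> -S \*" before and after
disabling the Server binding should flip the result from True to False.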

References 

<hobbit@avian.org>. CIFS: Common Insecurities Fail Scrutiny. 
January, 1997. <http://199.103.168.8:1089/webl/hak/cifs.txt>. 

Richard Sharpe. Just What Is SMB? May 14, 1998. 
<http://anu.samba.org/cifs/docs/what-is-smb.html>. 


Broken Paradigm 


by Hal Miller
<halm@usenix.org>

Hal Miller is president of the SAGE STG Executive
Committee.

Today I placed an order for over $5 million in computing
equipment. There will be many times that to come - certainly the
biggest single project I’ve ever worked on. What did I buy?
Mostly storage: 8 terabytes. At the current growth predictions
(much of which is already funded), I will approach, if not
exceed, a petabyte in the next four years.

“Neat!” you say. So did I. Then I realized: “bandwidth between 
disk and servers.” Then I thought, “Oh no! Backups!” 

Technology continues to advance. So does the demand for it.
There is a significant gap between those rates of growth, and the
future looks difficult for those of us tasked with using the
former to supply the latter. Let’s look at my real-life situation as an
example, then see what we might do about at least breaking the
problem down into solvable chunks, if not solve it as a whole.

With that much disk online, in a heavy-use environment
(read-write all over, 7 x 24 x 52, lots of users, 90-day-long jobs),
getting data back and forth between storage media and CPU
servers is a problem. With that many heads and spindles to
manage (over 1000!), the seek time for a given bit of
information can be long. Given that it’s all random-access filesystems
(well, there is a database, too, just for complications, but it’s
“relatively” small), there isn’t much of a way to index around and
cut down search time. This is all UNIX filesystem. We have
known for years that the UFS is nearing an upper limit on
directory size, and it appears to have other limits as well.

How do we deal with data integrity? There are RAID5 and
mirroring solutions, among others. Who wants to pay for the extra
disk (let alone computer room space, power, and air
conditioning) for my mirror?

Storage Area Networking is a solution for some of the
bandwidth issue. But, as with the other points, how long, at this rate,
before we outgrow that? Probably just about the time we finish
our first backup.

Speaking of which, the backup paradigm we all know dictates
copying either blocks or files to tape, in some pattern to allow
for restoral of data after hardware failures in some “reasonable”
amount of time (plus, in some places, to allow for restoral of data
after user error). I’m putting 16 DLT7000s into this. Filling the
tape library cabinet costs nearly $50,000 retail and will cover a
week or so. Filling the tapes with data may take more than that
week. That means I need to change what “backup” means. I can’t
take anything offline to dump, so I need either to “break” a mirror
or back up a “snapshot.” What technology I apply is really not the
issue here (nor am I looking for those other large sites out there
to pick on my scenario) - whatever that technology is, we have
already outgrown it, or will soon.

Enough on disk and backup. How about
security? My site is pretty well hidden.
There isn’t a lot of reason for people to
come looking for us except that our
router answers up on the Net like everyone
else’s. We are under scan or more
concerted attack every few hours, perhaps
more; I have no control over my
current network and can’t really see
effectively. Fighting this, recovering from
those incidents we’ve had (Linux mountd
and NT, all boxes I didn’t know were
brought in and connected), is a full-time
job, and I don’t have anyone to apply to
it. Tool building for IDS and other parts
of the game proceeds, but not fast
enough.

Technology advances have been staving
off “defeat” for a while and will continue
to attempt that, but we as an industry are
losing the battle. Demand continues to
skyrocket. Paradigms are stretched to the
point of breaking throughout the
computing world.

So what do we in SAGE do? It’s a hard
question without obvious answers. Let’s start
with what we can’t do, and see what’s left.
Then, remembering that our job as sysadmins
seems to include “performing magic” to
solve whatever odd problems nobody else
dealt with, we will try to pull yet another
rabbit from the hat.

We can’t develop better hardware solutions.
We can’t fund new hardware products.
Most of us aren’t advanced hardware
engineers. Maybe the vendors can
do these things, but they aren’t likely to
do so of their own volition, since they
make good money selling what they have
to offer us now. We need to apply our
reputation and efforts to convincing vendors
to join us in a “consortium” type of
effort to devise new long-term solutions.
We can sponsor workshops calling for
work in progress, brainstorming sessions,
or joint work proposals. We can fund our
own members to work on software tools
if they will return benefit to our community.
We can put our collective experience
together into reviewing what the requirements
really are and designing methods
to meet them. We can get vendors to
build it if we show them what we want.


This year I would like to see the formation
of a SAGE Development Fund, and a
SAGE Vendor Liaison function. I hope
they make some progress before I add the
next few dozen terabytes to my backup
system.

Election Results

by Gale Berkowitz
USENIX Deputy Executive Director
<gale@usenix.org>

The results of the election for the seven
SAGE Executive Committee positions of
the USENIX Association for the 1999-2000
term are as follows:


Barbara L. Dijker     542
Hal Miller            520
Peg Schafer           431
Timothy Gassaway      426
Xev Gittler           411
Jim Hickstein         362
Geoff Halprin         353

Not elected:

Bruce Alan Wynn       309
David Parter          289
Bryan MacDonald       257


Total number of ballots mailed: 4,337 

Total number of ballots cast: 656 

Return rate: 15% 

Total number abstained: 2 

Total invalid ballots: 1 

Newly elected SAGE Executive 
Committee members took office and 
chose their own officers at their executive 
committee meeting held 22 February 
1999, in New Orleans. The new Executive 
Committee officers are: 

President: Hal Miller 
Vice President: Barb Dijker 
Secretary: Tim Gassaway 
Treasurer: Peg Schafer 

SAGE Certification Subcommittee Briefs

Dan York has been added to the subcommittee.
Dan is a technical instructor and
training manager and is a volunteer for
the Linux Institute, a community project
established to develop professional
certification for Linux.
<http://www.linuxinstitute.org>

The Human Resources Research Organization
(HumRRO <http://www.humrro.org>)
has been selected to conduct comprehensive
research on system administration as
an occupation and perform an occupational
analysis.

The research will include review of existing
materials and data as well as active
measures such as surveys and focus
groups. This phase of the project should
be completed in August.

SAGE is also seeking sponsorship from 
individuals or organizations who wish to 
contribute to the certification efforts. 

See our Web site for more information. 
<http://www.usenix.org/sage/cert/> 
Round and Round We Go

by Barbara L. Dijker
<Barb.Dijker@labyrinth.com>

Barbara Dijker is Vice-President of the SAGE STG Executive
Committee.

I was struck with a bit of vertigo recently.
Reading through the recent missives
about the virtues and vices of certification
was the culprit.

The SAGE certification committee has
been trying to get the topic out on the
table so that all the views can be aired.
There have been articles in ;login:. At
LISA there was a debate. Since then, the
sage-members mailing list has had a
revived discussion. Finally, comments
and pleas are being sent to the SAGE
certification advisory council, and discussion
has been brewing there.

And there it was. Dizziness, nausea. The
cause is not that I’m sick of the topic.
The cause is that the discussion has not
only come full circle, it is going in circles
over and over again. There may be a few
new arguments for certification related to
changing markets. However, not a single
new insight has been raised against
certification in six years.

At the very heart of the debate, and
expressed incessantly in an infinite number
of creative ways, is the question of
whether what a system administrator
does can be put in a tiny little box
(cubicle, box, cubicle, box) and neatly labeled,
categorized, and evaluated. Gut reaction
to that idea is pure repulsion. It threatens
our identity. It undermines all the hard
work we’ve done to get this far.

Most system administrators practicing
today have formal training (if any) in a
vaguely related area and learned everything
the hard way. My education is in
physics and I used to program spacecraft.
What brought most of us into the
“profession” was a drive to figure things
out, the ability to learn things that
weren’t well documented, and the naivete
to think others would be grateful. How
do we measure that? Few of us have had
any formal education in system administration
- because it didn’t exist. Due to
our experiences, we find it difficult to
consider that one can really learn system
administration any other way.

At the same time, we all complain about
being overworked. I personally can’t wait
for genetic cloning. Then I (and my six
other clones) will be able to work something
less than a 12-hour day and have a
“normal” evening at home. There are
now, after about 10 years, a good number
of excellent tutorials, classes, and books
on various aspects of system administration.
We seem to be able to impart our
knowledge base. We should be able to
evaluate one’s application of that knowledge.
We should be able to certify that
evaluation. There appears to be value in
doing that.

Every “profession” goes through a maturation
process. I have no doubt the first
doctors were just as arrogant that their
skills could not be duplicated or measured.
Some still are. Same with lawyers
or any other group doing work which
involves significant experience and cognitive
process. But as the work evolves, the
knowledge and practice of the work
evolve and can more readily be imparted
and evaluated. Think about where we
might be without certification of existing
professions like nursing. Certification is
never a replacement for apprenticeship or
on-the-job experience; it’s a foundation
upon which to build.

Maybe system administration isn’t ready
for certification yet. But one day it will
be. How will we know when? Probably
when the first-generation system administrators
are long gone and the new generation
can’t remember their old arguments
against it. The new generation will
become system administrators not
through the School of Hard Knocks but
through one of any number of training
programs already available, which are
improving all the time.



Photo by Mark Mellis 

“Presented to Tina Darmohray for her
dedication and tireless efforts which promote
understanding and recognition of
the System Administration Profession”


The 1998 Outstanding
Achievement Award
system profiles with syssumm

I joined the corporate server support group for SAIC as a senior UNIX system 
administrator in June of 1998. Immediately I began to get lots of questions 
about the configuration of various UNIX boxes that the group had on hand: 

“How much memory?” 

“How much disk storage?” 

“How are the file systems laid out?” 

“Is it running NIS?” 

I began to profile a number of these systems by creating a template in an MS Word
document that described the general identity, hardware, network, and software
configuration of each of these systems. Up to seven or eight systems, this arrangement
worked just fine. However, we kept getting involved with more systems (or people saw our
profiles and asked us to do ones for them). I’d also think of another piece of information
to add to the profile and would have to go through and edit each of the profiles to keep
them in sync. All in all, the unautomated task of maintaining these profiles was
becoming unwieldy.

I began looking around for existing tools for automatically profiling UNIX systems
(especially useful when you have 40 or 50 servers from several vendors, with varying
configurations, rather than hundreds of identical workstations), but I found very little
existing software. Russ Allbery pointed me to one C program, SysInfo, but it seemed to
deal mostly with tunable parameters.

Ultimately, I wrote a series of Perl scripts to try to automate the profiling process,
mostly as a “proof of concept.” The requirements were:

■ Collect information using Perl 5 or later.

■ Don’t require any additional CPAN modules, if possible.

■ Avoid exotic “deep magic” system calls, if possible.

■ Allow certain fields to be filled in by the sysadmin.

■ Allow these sysadmin fields to “include” multiple external files (in the same
directory), including text and graphic files.

■ Allow the fields collected to be extended.

■ Be aware of security concerns, since some of the information can only be collected if
you are running as root.

■ Allow transmission of the collected information by email (so the process can be
automated with cron).

■ The system that receives the messages needs to be able to have an “agent” process these
incoming messages without human intervention.

■ Allow the profiles to be displayed by Web browser.

■ The Web server displaying the profiles needs to be able to support Perl CGI scripts.

■ Allow the final format of the profile to be determined by a format file rather than be
hard-coded.



by Bruce W. Mohler
<bruce.w.mohler@saic.com>

Bruce is a Senior UNIX System Administrator for SAIC, where he is on the
Systems Engineering team and provides support for Corporate Solaris, HP-UX,
and Linux systems.



As of this writing, these scripts work for HP-UX 9.X and 10.X and for Linux systems. By 
the time you read this, I hope to have extended them to Solaris 2.5 and 2.6 systems, and 
perhaps to Windows NT. One set of scripts automatically generates profile information. 
Since I'm probably not the only person to
get these kinds of questions, I am wondering
if there is any interest within the
SAGE community to collaborate on this
software to extend it, refine it, port it to
other "species" of UNIX, and make it
available to the SAGE community as a
whole. I've gotten the permission of my
manager to release the Perl source if
there's sufficient interest. If you're interested
in collaborating on this software, or if
you'd just like to keep track of the
progress of the software and use it once
it's more functional, please contact me at
<bruce.w.mohler@saic.com>.


The profile information is emailed to another system where a Perl script handles the
incoming message. If it’s the first profile for this system, then a subdirectory is created
and the profile is stored there; otherwise it is merged in with existing profile
information for that system. A third Perl script serves as a CGI script to generate a form to
query for system profiles, summarize all of the profiles that exist, and create the HTML
for the profile of a requested system. The code that generates the profile output is
driven by a format file so that a local site can customize the format of its Web-based system
profiles without having to hack a single Perl script. This entire process, while perhaps
not elegant, works end-to-end today. The rest of this article provides more detail about
the software, which I call syssumm, as it exists. It is taken from the existing README
file and includes some examples of the files and Web pages generated.
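
The "create on first sight, otherwise merge" step described above can be
sketched in a few lines. What follows is an illustration in Python rather than
the Perl that syssumm actually uses, and it makes assumptions the article does
not confirm: that a profile is a set of "field: value" lines, that it lives in
a file called profile.txt, and that newly collected values replace stored ones.
The function names are invented for the example.

```python
import os


def parse_profile(text):
    """Parse 'field: value' lines into a dict (this file layout is an assumption)."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def store_profile(root, hostname, new_fields, filename="profile.txt"):
    """Create the host's subdirectory if needed, then merge and save the profile."""
    # Refuse host names that could escape the profile tree.
    if not hostname or os.sep in hostname or hostname.startswith("."):
        raise ValueError(f"unsafe host name: {hostname!r}")

    host_dir = os.path.join(root, hostname)
    path = os.path.join(host_dir, filename)
    os.makedirs(host_dir, exist_ok=True)  # first profile: creates the subdirectory

    merged = {}
    if os.path.exists(path):
        with open(path) as handle:
            merged = parse_profile(handle.read())
    merged.update(new_fields)  # newly collected values win over stored ones

    with open(path, "w") as handle:
        for key in sorted(merged):
            handle.write(f"{key}: {merged[key]}\n")
    return merged
```

Calling store_profile twice for the same host with different fields leaves one
file containing both sets, which is the behavior the merge step needs.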

Components Of syssumm 

The syssumm software has two parts: the software that runs on the remote systems and 
collects the configuration information, and the software that runs on the Web server, 
processing incoming messages and merging the information into the appropriate
subdirectories.

The “remote” software is composed of a Perl “driver” script called syssumm.pl that 
makes calls into OS-specific Perl modules, plus a common Perl module of utility
subroutines. In general, there is a Perl module for each vendor’s OS. For example, for
Solaris, HP-UX, and Linux you would have SunOS.pm, HPUX.pm, and LINUX.pm. 

Only the module for the local OS running on the system is “used” by the syssumm.pl 
script. For example, if you are running the script on a Sun box, the HPUX.pm and 
LINUX.pm modules are never loaded. When it came to extracting information about the 
remote system, keeping things simple was valued over being extremely clever. 

The “Web server” software is composed of a Perl script called incoming.pl that 
extracts and processes incoming email messages, and a CGI Perl script called 
sysquery.pl that generates an HTML form, then processes the returned results of 
that form and displays a system summary. The CGI script also generates a Web page 
summarizing all existing system profiles. 

Software Requirements 

On the remote systems the only requirement is Perl 5.x. No additional CPAN Perl
modules are required to run the profiling script. On the Web-server system the
requirements are Perl 5.x and a sendmail-like MDA that allows the processing of
messages through a .forward mechanism. The Web server must support handling CGI scripts
written in Perl.

How Does the Software Work? 

First, the system administrator sets up the Web-server software, creating a new account 
on the system where the Web server exists with a userid of “syssumm”. The output of 
syssumm.pl would be emailed to this account.

Next, in the home directory of the “syssumm” account, the system administrator creates 
a .forward file which delivers each incoming message to the incoming.pl script that
processes the messages and places them in the appropriate directory under the Web 
server. 

The system administrator loads the “remote” software onto a system to be profiled and 
either manually runs the syssumm.pl script or sets up cron to run it periodically. The
“-m” command-line option specifies the email address to which to mail the output.
(This should be “syssumm@Webserver.yada.yada.yada”.)
Next, a pointy-haired user brings up the form (generated by
sysquery.pl) and enters the name of a system, which is sent back to
the same sysquery.pl script to interpret. If the name equates to
an existing system for which there is profile information, then that
information is formatted as HTML and returned to the requestor's
browser. Figure 1 shows what the query form looks like.

A format file is used to control the appearance of the Web page so
that you can choose what fields to display in your system summaries
and in what order they appear. Figure 2 shows what a system
profile would look like.

The existing sysquery.pl script is intended to be an example of 
how to query and retrieve system profiles; obviously, your local 
Web pages will have a different look. 



Figure 1. Query form 


Appearance of the Generated Data 

The information generated by syssumm.pl and processed by incoming.pl is a basic 
ASCII flat file. Each line is self-sufficient and is composed of: 

category:sub-category:value 

The following categories have been “hard-coded” into the script that creates the
output:

General 

Hardware 

Network 

Software 

Comment 

Within each category are subcategories. For example, within the General category, 
you’ll find: 

NodeName 

Organization 

Vendor 

Model 



Figure 2. System summary 


Some of the subcategories are “indexed.” For example, the subcategories for disks and 
tapes look like: 


Disks-0 
Disks-1 
Disks-2 


Tapes-0 
Tapes-1 


Each line represents an individual device. 

Entries in the Comment category are optional. They would be remarks such as “Run as 
root to get more information.” 


April 1999 ;login: 




SAGE NEWS & FEATURES 















































This is what the actual lines would look like for a hypothetical system: 

General:NodeName:ornithomimus 
General:Organization:PROTECTED 
General:Vendor:Sun 
General:Model:Geewhiz 
General:HostId:12345678


Note that certain lines contain the value “PROTECTED”. These fields signify that this
information cannot really be figured out by a stupid Perl script and needs the
omniscience of a human system administrator.
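
The category:sub-category:value lines shown above are easy to load into a nested hash. The following is only a sketch of one way to do it, not the code that syssumm itself uses: parse_profile is a hypothetical helper, and the Disks-0 value is invented for illustration. It assumes that a value may itself contain colons, so the split is limited to three fields.

```perl
use strict;
use warnings;

# Parse "category:sub-category:value" lines into a nested hash.
# parse_profile is a hypothetical helper, not part of syssumm.
sub parse_profile {
    my @lines = @_;
    my %profile;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;      # skip blank lines
        # Limit the split to 3 fields so colons inside the value survive.
        my ($category, $subcategory, $value) = split /:/, $line, 3;
        next unless defined $value;     # skip malformed lines
        $profile{$category}{$subcategory} = $value;
    }
    return \%profile;
}

my $profile = parse_profile(
    "General:NodeName:ornithomimus\n",
    "General:Organization:PROTECTED\n",
    "Hardware:Disks-0:c0t0d0: 2 GB\n",   # invented value containing a colon
);
print "$profile->{General}{NodeName}\n";    # prints "ornithomimus"
```

Because each line is self-sufficient, the order of the lines does not matter to a parser like this one.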

If a system has been profiled in the past and an updated profile is sent to the Web server 
system, PROTECTED fields will not overwrite prior contents. This is especially effective 
when the system administrator has gone in and provided the information that the Perl 
script couldn't figure out. This feature protects the work that they've done so that they 
don't need to fill in those fields again. 

Note that when the incoming.pl script processes PROTECTED fields (especially for
the first time), it changes “PROTECTED” to “To be provided” (to make the report more
readable).


Fields Collected by the Profiling Scripts 

The following list summarizes the categories and subcategories collected. 
General
    NodeName
    Organization            (PROTECTED)
    Vendor
    Model
    HostId
    SystemHandle            (PROTECTED)
    Location                (PROTECTED)
    DateSystemInstalled     (PROTECTED)
    LargerPicture           (PROTECTED)

Hardware
    Processors
    Memory
    Disks                   (Indexed)
    Tapes                   (Indexed)
    Console                 (PROTECTED)
    OtherPeripherals        (PROTECTED)

Network
    DomainName
    DefaultRouter
    NameServer
    NetworkInterfaces       (Indexed)
    HardwareNetworkAddress

Software
    OsName
    OsVersion
    NbrOfLicenses
    Patches                 (Indexed)
    RunningLpsched
    Printers
    RunningSendmail
    RunningNfs
    RunningAutomount
    RunningNis
    RunningNameserver
    RunningSamba
    GraphicalUserInterface
    FontServer
    LocalFileSystems
    RemoteFileSystems
    RunningAccounting
    ApplicationsInstalled   (PROTECTED)
    InstallationProcedure   (PROTECTED)
    DataFlowsTo             (PROTECTED)
    DataFlowsFrom           (PROTECTED)

Comments 

The “LargerPicture,” “DataFlowsTo,” and “DataFlowsFrom” PROTECTED fields are
intended to provide the context in which the system resides within your local data center.

Note that while items such as patches are summarized, tunable parameters are not. This 
may be an area of interest for some system administrators to profile. 

Multi-line Values and Extensibility 

The system is extensible in the sense that you can add fields to the profile. When
incoming.pl merges a new profile into an old one, non-PROTECTED fields “overwrite”
existing fields, existing PROTECTED fields are unchanged, and new PROTECTED
fields are installed in the profile as “To be provided.” As long as you add them to the
format file that controls the CGI/HTML output, they will appear in any system profiles
requested after that point.
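
The three merge rules just described can be sketched in a few lines of Perl. This is an illustration of the rules only, not the actual incoming.pl code: merge_profile is a hypothetical helper, the profiles are assumed to be flat hashes keyed by "category:sub-category", and the sample values are made up.

```perl
use strict;
use warnings;

# Merge a new profile into an old one. Both are hash references keyed by
# "category:sub-category". merge_profile is a hypothetical helper.
sub merge_profile {
    my ($old, $new) = @_;
    my %merged = %$old;
    for my $key (keys %$new) {
        if ($new->{$key} ne 'PROTECTED') {
            $merged{$key} = $new->{$key};       # ordinary field: overwrite
        }
        elsif (!exists $merged{$key}) {
            $merged{$key} = 'To be provided';   # new PROTECTED field
        }
        # otherwise an existing PROTECTED field: keep what is on file
    }
    return \%merged;
}

my $old = {
    'General:NodeName'     => 'ornithomimus',
    'General:Organization' => 'Paleontology Lab',   # filled in by hand
    'General:Model'        => 'Geewhiz',
};
my $new = {
    'General:NodeName'     => 'ornithomimus',
    'General:Organization' => 'PROTECTED',
    'General:Location'     => 'PROTECTED',
    'General:Model'        => 'Geewhiz II',
};
my $merged = merge_profile($old, $new);
print "$_ = $merged->{$_}\n" for sort keys %$merged;
```

In this sketch the hand-entered Organization survives the update, Location appears as “To be provided,” and Model is overwritten by the newly collected value.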

PROTECTED fields are allowed to point to one or more files that may contain text or
(Web-compatible) graphics. In fact, if the value begins with a “.” (period) or “/”
(forward slash) - indicating a relative or an absolute path, respectively - and if that file
exists, then the contents of the file are included in place of the value.
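
A minimal sketch of that path test follows. It is not taken from the syssumm source: resolve_value is a hypothetical helper, it handles only a single plain-text file (no graphics, no lists of files), and it assumes a Perl recent enough to support three-argument open with lexical filehandles.

```perl
use strict;
use warnings;

# Return the text to display for a field value. resolve_value is a
# hypothetical helper that handles one plain-text file only.
sub resolve_value {
    my ($value) = @_;
    # Only values starting with "." or "/" are treated as paths, and
    # only if a file of that name actually exists.
    return $value unless $value =~ m{^[./]} && -f $value;
    open my $fh, '<', $value or return $value;   # unreadable: show the value
    local $/;                                    # read the whole file at once
    my $contents = <$fh>;
    close $fh;
    return $contents;
}

print resolve_value('To be provided'), "\n";     # not a path: printed as-is
```

Falling back to the raw value when the file is missing or unreadable keeps the report usable even if a path goes stale.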

Summary 

These scripts are the humble beginnings of a system that can profile general, hardware, 
network, and software configuration information about UNIX systems and display the 
output on any Web browser. 











<dclark@mindsrc.com> 


by David Clark 

Dave Clark is president and 
founder of MindSource 
Software Engineers, a
technical-talent company devoted
to staffing for system and 
network administration. Dave 
is a former UNIX systems 
engineer and administrator. 



resume your resume 
writing 

Of all the stumbling blocks for techies in search of a career change, writing a 
resume is high on the list of possible points of procrastination. Fortunately, 
unlike many software professionals, I've always enjoyed writing and formatting 
resumes; it presents a challenge to me, and I enjoy the results. Over the past 
seven years as president of a technical-staffing company, I’ve written many 
thousands of resumes, most of them for UNIX system administrators. I'll share a 
few tips and some useful anecdotes that may make writing yours a more
palatable, and profitable, proposition.

The primary purpose of a resume is to land an interview. However, the resume, or
curriculum vitae (CV), is a part of an overall marketing plan that will assist you in
renavigating the waters as your job demands. Resumes are also maintained by
human-resources (HR) staff as a credential for your claimed abilities after you are hired.
Resumes of prominent individuals in corporations are often provided for contractual 
purposes such as engaging in a service-level agreement. You’ll need a resume for many 
purposes as you proceed along your career path, so make it a good one. 

Writing a resume is pretty straightforward. First, figure out who your target audience is 
and what they need to know, and then aim it towards them. For most of us, the target 
audience is someone who can bring us in for an interview. Avoid HR people and 
resume mills as your target interview audience, unless you are desperate. Instead, I 
advise creating resumes for hiring managers and finding ways to get your resume 
directly into their hands. 

In my mind, HR stands for Huge Roadblock. The following anecdote backs this up. A 
friend of mine, who was a manager at a large database vendor, was discouraged
because she wasn’t getting any candidates for her open positions. Another friend of 
mine was looking for work as a database administrator and was a perfect fit for the 
position. It turned out that he had sent his resume to her company almost a dozen 
times in the past year, mentioning the applicable job numbers in his cover letters, with 
no response. When he was referred through word of mouth, the hiring manager 
screened and hired him on the spot. HR people are generally capable of producing 
swank holiday parties and finding a dentist in your HMO group, but don’t count on 
them to help you find a job. So, if possible, always aim for the hiring manager. 

System administrators should focus their resumes on two criteria: first, whether you are 
looking for contract or full-time permanent employment and second, your level of 
seniority. Resumes are typically no longer than two pages. Try to fit what you really
need to say into that amount of space.

Cover Letters and Packaging 

Make your presentation clean and to the point. Include a very brief cover letter. Don’t 
create a point of objection by mentioning your gun collection or religious beliefs. The 
cover letter is only an introduction; it will probably get tossed or stapled to the back of 
the resume. 
<date>

Greetings, 

In response to your ad in the Washington Post for a Senior UNIX Analyst, I am
including my resume for your consideration. I may be reached after hours at my home
number or via email to my personal account.

<at this point, if you have a brief item that warrants special attention, set it out as a 
hook, but don’t give too much away> 

I have worked at your competitor, Xylkorp, for three years and was responsible for 
architecting their entire facility. 

Thank you, 

<sign here> 

Remember that your resume will probably be “processed”; try to anticipate the clerical 
person on the receiving end and avoid folding the resume more than twice. For another 
32 cents you can send the resume in a 9x12 envelope with no folds; this stands out. 

Distribution 

I’ve heard of people sending over 500 resumes out at a time in a mass marketing effort. 
This is a great big mistake. People who receive resumes don’t want to feel that you have 
disseminated the document widely. In fact, the hiring staff wants the impression that 
this is a special overture for them alone. Instead, plan your career moves, select specific 
companies that interest you and find ways to get the attention of the right people inside 
that company. Do a little shopping while you are still happily employed; peruse the 
want ads or newsgroups. Poke around. You’re better off being selective in your job hunt. It’s
a better use of your time and everyone else’s. 

Format 

It’s always easiest to start from an example, so I’ll include one of mine. Historically, 
resumes were sent in a #10 envelope and folded twice into three sections. Because of 
this, the top third of the resume is considered the place to catch the reader’s attention. 

I’d avoid using an “objectives” header. It can be too limiting and provide a good excuse 
not to read the rest of the resume. Throughout, use specifics, depth, and consistency to 
catch the eye of the hiring manager. For a particular position, I start with the job 
requirements and follow them as an outline, ensuring I’ve addressed as many applicable
points as possible. 

Dave's Resume Format 

NAME 

CONTACT INFO 
CRITICAL SKILLS 

A complete paragraph that concisely explains your abilities and interests. This section 
can easily be changed to tailor the resume to specific jobs. Move the most relevant 
material to the first few sentences. 

» fewer than ten sentences « 

Highlights, optional acronyms, buzz words supporting critical skills, e.g., 




OS: Solaris 2.6, HP/UX 9.x, Irix 6, Linux, FreeBSD, NT 
Hardware: Sparc, board level and FRUs 
Scripting: Perl 5.x, Ksh, light TCL 
Facilities: DNS, Sendmail, RADIUS 


» five lines max « 
EDUCATION 


College degree and recent germane technical training 


» three lines max; don’t list your graduation date « 


PROFESSIONAL WORK EXPERIENCE 
» HEADING « 


Company_Name 


(6 months) 


Senior Unix System Administrator. Responsible for ... 


OR 


Company_Name 
Sept ’98 to present 

Operations Engineer. Provided back-line support for ... 

» Use parallel construction. « 

There are three or four forms of listing job experience; just use one of them
consistently.

» NEVER refer to yourself in the third person, e.g., “Mr. Jones created a 
backup system to handle.” « 

» Unless you are in a management role, it’s generally advisable not to toot 
your horn about cutting costs, increasing efficiency by a multiple, etc. « 

» Weave a thread. Show that the previous job led to your taking on new and 
different responsibilities.« 

» Show lots of details. Managers love specifics, e.g., 

WRONG: “Worked at way big site with big, cool computers.” 
CORRECT: “Served in a team which managed a 7x24 site with over 
500 Solaris 2.6 desktop systems and 24 servers.” 

WRONG: “Wrote backup systems.” 

CORRECT: “Modified existing Perl 4.x backup systems to interface
Legato version 2 software with HP and Solaris file systems totaling
700 gigabytes.” «

» Put the most bulk in the most recent job. For sysadmins I can usually fit
the two most recent jobs on the front page together with critical skills and
education. «

» Older jobs should be pared down to shorter paragraphs. If the job is over 
10 years old, consider leaving it off or doing an honorable mention with a 
modest sentence. No one really cares about how many lines of CPM assembler 
you wrote in 1982.« 
OTHER TRAINING, CERTIFICATIONS, AND ASSOCIATIONS


Noncollege work 
Historical training classes 
USENIX, IEEE Membership 

COURSEWORK 

Avoid listing coursework on a resume unless this is your first job out of school. Refer 
only to coursework that is specific to the job field you are applying for, e.g., “Thesis: 3D 
rendering algorithms for multi-tasking operating systems” 

HOBBIES / PERSONAL INTERESTS 

National Rifle Association 
US Postal Workers Support Group 
Tai-Kwak-Wo Kick Boxing Champions of Gilroy 
Abott Labs Pharmaceutical Research Volunteer 

»Consider what you want to tell the reader in this section of your resume. If 
you must show personal information, limit it.« 

If you have been out of school for more than five years, avoid this section. Hobbies and 
personal interests do not belong on the resume of a consultant. 

Tips and Observations 

If you are surface-mailing your resume, avoid using extra-fancy paper stocks; most of 
the time it will be photocopied. Use a serif font between 11 and 14 point. Unless you are 
an inventor or marketing whiz, avoid gimmicks like rubber gloves, patchouli oil, pictures,
fancy images, unusual designs, or excessive colors. Leave a sufficient amount of white 
space and borders. 

Unless you speak French, don't put the acute accent marks over the word “resume,” even 
if you did have to write a PostScript subroutine to make it perfect. It's an affectation
and a distraction. 

Be warned that an emailed resume can be altered easily. Ditto the resumes posted via 
URL. Don’t send your references until you have talked to a human, there is a strong 
interest in that company, and that company has explicitly asked for them. 

Finally, be particular about quality details, spelling, formatting, copy quality, and 
legibility. 
<Brent@Covad.COM>


bandwidth versus 
latency 

Helping Developers Understand Network 
Performance Limitations 


Software developers and integrators frequently don’t understand what
performance they can reasonably expect from a corporate network. This often leads
them to design and implement software that works great on a LAN but is utterly
unusable when deployed across a WAN. To address this at our company, I put
together the following short email message to our developers, explaining the
basics of network performance (particularly the difference between bandwidth
and latency), describing how this affects client-server applications, outlining
what performance they should expect from our corporate network, and
suggesting guidelines for their client-server development efforts. If you face the same
kind of constraints in your business, you might consider providing something
similar (with numbers changed appropriately) to help your own developers
understand their environment better. You can also apply this same principle in
other areas, such as CPU, memory, or disk performance, to help folks
understand what they can expect.

I’d like to suggest that any application or service that you’re designing to run over our
corporate WAN should be designed so that it performs adequately under the
following conditions:

■ Single user 

■ Single task (no other applications using the WAN link simultaneously) 

■ 128 kb/s bandwidth 

■ 100 ms latency (one way) 

We will have much more bandwidth than that available to most corporate sites (but
not all sites, and not all the time). We should use that bandwidth, however, to
accommodate multiple simultaneous sessions (from multiple users, from multitasking
individuals, or from some combination of both), rather than counting on it to provide
adequate performance for any individual session.

If you keep these guidelines in mind as you develop applications and services, it will 
go a long way toward ensuring that our applications and services work well across 
our entire corporate WAN. 

I’m not suggesting these numbers arbitrarily. There is some logic behind them: 128
kb/s bandwidth is what you can expect out of an ISDN line. Right now, for cost
reasons, all our corporate regional offices are limited to 128 kb/s burst speed for
bandwidth, on a public-carrier frame relay WAN. Each office should have more bandwidth
available in a few months when [something confidential] happens, but they’ll still
have backup connectivity that’s limited to 128 kb/s (either their current frame relay
service, or dial-on-demand ISDN service). And they’ll still have folks who want to use
the apps from home, some of whom will be limited to only ISDN bandwidth (128
kb/s).

As for latency, 100 ms one-way is about what we can expect out of our coast-to-coast
frame relay-based IP WAN during peak usage periods each day. One-way latency on
our LAN at headquarters is usually 3-5 ms or less. A task where a client/server
application or service does 10 round-trip conversations between the client and the server
on a path with 5 ms one-way latency can still complete in 100 ms (1/10th of a
second), which “feels fast”; that same task on a link with 100 ms latency will take 2000
ms (2 seconds), which almost certainly “feels slow.” ...

The upshot of this for software developers is: 

■ Don't develop applications or services that need to transfer data between the client 
and server at more than 128 kb/s in order to avoid “feeling slow.” 

■ Don't develop applications or services that require lots of little back-and-forth
messages between the client and server; you'll get eaten alive by the latency.
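
The arithmetic in the message is simple enough to hand to developers as a small calculator. The following Perl sketch is only a rough model under stated assumptions: task_time_ms is a hypothetical helper, it counts two one-way latencies per round trip, it treats 1 kb/s as 1000 bits per second, and it ignores protocol overhead and congestion entirely.

```perl
use strict;
use warnings;

# Rough time for one task: latency cost plus transfer cost.
# task_time_ms is a hypothetical helper that ignores protocol overhead.
sub task_time_ms {
    my ($round_trips, $one_way_latency_ms, $bytes, $bandwidth_kbps) = @_;
    my $latency_ms  = $round_trips * 2 * $one_way_latency_ms;
    my $transfer_ms = $bytes * 8 / $bandwidth_kbps;   # bits / (kbit/s) = ms
    return $latency_ms + $transfer_ms;
}

printf "LAN, 10 round trips: %d ms\n", task_time_ms(10,   5,      0, 128);  # 100 ms
printf "WAN, 10 round trips: %d ms\n", task_time_ms(10, 100,      0, 128);  # 2000 ms
printf "WAN, plus 16000 bytes: %d ms\n", task_time_ms(10, 100, 16_000, 128);  # 3000 ms
```

The first two lines reproduce the 100 ms and 2000 ms figures from the message; the third shows that even a modest 16,000-byte transfer adds another full second at 128 kb/s.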

Remember to include whatever numbers are appropriate for your site and to clearly 
outline the performance constraints and expectations plus the technical explanation for 
them. Since it's easier for everyone to accommodate design specifications up front, 
rather than retrofit applications after the fact, everyone will appreciate this “heads up” 
information. 











<joseph@5sigma.com> 


by Joseph N. Hall 

Joseph N. Hall is the 
author of Effective Perl 
Programming (Addison- 
Wesley, 1998). He teaches 
Perl classes, consults, and 
plays a lot of golf in his 
spare time. 


effective perl 
programming 


Analysis Without Paralysis 

Perl is particularly well-suited for the analysis of log files and other similarly 
organized text. Using Perl, you can search files for entries meeting particular 
requirements (as with the grep command, but more powerfully), you can build 
data structures that capture and organize the contents of files, and you can 
summarize or restructure data in the files. In this column, I’ll illustrate the 
techniques that Perl programmers use to perform these tasks, starting with the 
basics, then proceeding to more sophisticated examples. 

Processing One Line at a Time 

Many log files are organized so that each line is a separate “record” in the log. Generally, 
you want to process this type of file one line at a time. The idiom for this in Perl is the 
ubiquitous: 

open FILEHANDLE, "/my/file" or die "couldn't open: $!";
while (<FILEHANDLE>) {
    # do something with the contents of $_
}
close (FILEHANDLE);

The while (<FILEHANDLE>) loop is a shorthand way of writing:

while (defined ($_ = <FILEHANDLE>)) {
    # do something with the contents of $_
}

Both these snippets read a line at a time into $_ from the file opened as FILEHANDLE. 
Inside the while loop, you put whatever code is necessary to process a line of the file. 
For example, to print all the lines containing the word 5sigma, you could write: 

while (<FILEHANDLE>) {
    print if /\b5sigma\b/;    # print and // both default to $_
}

You might choose to extract information during the loop and then print it out in some 
other form after the file has been completely read. Often, you will want to read data into 
a hash as part of this process. For example, to parse the passwd file and create hashes 
that map user names to user ids and vice versa - a bit of makework, mind you, because 
this capability already exists in the built-in getpwnam and getpwuid operators - you 
might write: 

open PASSWD, "/etc/passwd" or die "couldn't open passwd: $!";
while (<PASSWD>) {
    chomp;
    my ($name, $dummy, $uid) = split /:/;    # split defaults to $_
    $uid{$name} = $uid;     # add a new name/uid to %uid
    $name{$uid} = $name;    # add a new uid/name to %name
}
close (PASSWD);

for (sort keys %uid) { print "uid for $_ is $uid{$_}\n" }
for (sort {$a <=> $b} keys %name)
    { print "name for $_ is $name{$_}\n" }

Note that I am spelling foreach as for here. The foreach and for tokens are 
interchangeable. 
The split operator breaks each line of the password file into its constituent fields. We
assign the first and third fields to $name and $uid, respectively, then use those values to
create hashes. (Note that there is no conflict between the scalar variables $name and
$uid and the hashes %name and %uid - they are independent.) The last two lines print
out the contents of the two hashes. Because the keys of %name are numeric user ids, they
must be sorted in numeric order rather than the default “ASCIIbetical”
(character-by-character) order; thus the sort block {$a <=> $b}.
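
To see why the sort block matters, here is a small self-contained comparison. The list of user ids is made up for illustration.

```perl
use strict;
use warnings;

my @uids = (0, 1, 10, 100, 2, 25);    # made-up user ids

my @string_order  = sort @uids;                  # default string comparison
my @numeric_order = sort { $a <=> $b } @uids;    # numeric comparison

print "@string_order\n";     # prints "0 1 10 100 2 25"
print "@numeric_order\n";    # prints "0 1 2 10 25 100"
```

With the default comparison, 10 and 100 sort ahead of 2 because the strings are compared one character at a time.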

Reading Multi-Line Records 

You may occasionally encounter text files where records occupy several lines and are set 
off from one another by delimiting lines. Perl’s scalar .. operator, also known as the 
“flop” operator, is sometimes helpful in dealing with this type of file. Suppose, for 
example, that you are parsing a file consisting of records that look like the following: 

begin user joebloe 
name: Joseph N. Hall 
phone: 555-1212 
email: joseph@5sigma.com 
end user 

The following code will scan input one line at a time and print out only the record(s) 
for the user joebloe: 

while (<>) {    # read from standard input or files in @ARGV
    print if /^begin\s+user\s+joebloe\b/ .. /^end\s+user/;
}

The flop operator works by maintaining a “state” that is either true or false. Each flop
operator in a program has its own state. The flop operator starts out yielding false, and
first yields true when the lefthand expression evaluates to true. It then continues to yield
true up to and including the evaluation in which the righthand expression becomes true,
after which it yields false again. It's a slightly obscure feature of Perl, but, as
you can see, when it's right for the job it can yield very succinct programs.
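
The behavior is easier to follow on a concrete input. The sketch below runs the same pair of patterns over an in-memory list instead of a file; the records for janedoe and other are invented for illustration.

```perl
use strict;
use warnings;

my @lines = (
    "begin user janedoe\n",
    "name: Jane Doe\n",
    "end user\n",
    "begin user joebloe\n",
    "name: Joseph N. Hall\n",
    "phone: 555-1212\n",
    "end user\n",
    "begin user other\n",
    "end user\n",
);

my @record;
for (@lines) {
    # True from the matching "begin" line through the next "end" line,
    # inclusive; false everywhere else.
    push @record, $_ if /^begin\s+user\s+joebloe\b/ .. /^end\s+user/;
}
print @record;    # the four lines of the joebloe record
```

Only the joebloe record is collected: the earlier “end user” line is ignored because the state is still false at that point, and the state goes back to false once joebloe's own “end user” line has been seen.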




Reading a File All at Once 

Perl programmers tend to read files one line at a time - Perl has a lot of features that 
work well on “line at a time” input, and if lines have a known maximum length, you 
can be assured that a program reading one line at a time can handle a file of any length. 
However, sometimes you may want to read the entire contents of a file all at once - to 
do some multi-line pattern matching, or for efficiency, or “just because.” The customary 
way to read all of a file is to clear the line separator variable $/. If $/ has the value
undef, the line input operator <> will read the entire contents of input into a scalar
rather than a single line from it. Here is an example where we read the password file all 
at once and create a hash of the names and user ids in one fell swoop: 

{
    open PASSWD, "/etc/passwd" or die "couldn't open passwd: $!";
    local $/;    # undefs $/ until the end of this block
    %uid = (<PASSWD> =~ /^(.*?):.*?:(.*?):/mg);    # all at once!
}

for (sort keys %uid) { print "uid for $_ is $uid{$_}\n" }

Note that $/ is a single global variable; it is not set separately for each filehandle.
We therefore use local inside a bare block to undefine $/ temporarily, and its previous
value is restored automatically when the block exits. Inside the block, we read the
entire contents of the password file and perform a match that returns a list of names
and user ids suitable for initializing a hash - note the /m and /g options in the match
operator.
Searching Simultaneously for Multiple Patterns

Sometimes you will want to search a file for lines matching one of several patterns. 
Certainly, you could write something like: 

while (<FILEHANDLE>) {
    print if /\bjoseph\b/i or /\bhall\b/i;
}

You can interpolate variables into match operators if you want to specify patterns at 
runtime: 

($pat1, $pat2) = qw( (?i)\bjoseph\b (?i)\bhall\b);
while (<FILEHANDLE>) {
    print if /$pat1/ or /$pat2/;    # (?i) gives case-insensitivity
}

You have to be concerned about a couple of things when interpolating variables into 
match operators. First, the variables must contain legal regular-expression syntax. For 
example, if $pat1 in the example above contains :-), a fatal error will occur at runtime
because /:-)/ is not a legal regular expression. (The quotemeta operator can be helpful
in these cases - see the perlfunc man page.) Second, when a match operator contains 
variables, the regular expression is recompiled each time that the match operator is 
used, generally resulting in slower performance. The /o (“compile once”) option causes 
a regular expression containing variables to be compiled only once: 

($pat1, $pat2) = qw( (?i)\bjoseph\b (?i)\bhall\b);
while (<FILEHANDLE>) {
    print if /$pat1/o or /$pat2/o;
}

To get this to work with arbitrary lists of patterns, though, you need to resort to some 
trickery. The usual method is to use a string eval returning an anonymous subroutine 
in combination with a /o match operator. This makes it possible to construct a list of 
anonymous subroutines, each of which searches its argument for a particular pattern: 

@pats = qw( (?i)\bjoseph\b (?i)\bnathan\b (?i)\bhall\b);
@search = map { eval q{ my $pat = $_;
                        sub {$_[0] =~ /$pat/o} } } @pats;
while (defined ($line = <FILEHANDLE>)) {
    for (@search) {
        if ($_->($line)) { $count++; last }
    }
}
print "matches = $count\n";

You could also construct a single pattern that matches an alternation of the original list 
of patterns. That might appear to be more efficient at first, but in my benchmarks it 
doesn't seem to make a large difference. 

If you are using Perl 5.005, an alternative (and more readily comprehensible) means of 
interpolating regular expressions is available through the qr (quote regex) operator. 
When 5.005 is widely adopted, qr will become the most appropriate mechanism for 
engineering solutions to this type of problem. 
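
For comparison, here is a sketch of the same search written with qr. It assumes Perl 5.005 or later, and it reads from a small in-memory list of made-up lines rather than from FILEHANDLE so that it can be run on its own.

```perl
use strict;
use warnings;

# Compile each pattern once with qr//; no string eval or /o is needed.
my @pats = map { qr/$_/ } qw( (?i)\bjoseph\b (?i)\bnathan\b (?i)\bhall\b );

my @lines = ("Joseph N. Hall\n", "nobody here\n", "Town hall meeting\n");
my $count = 0;
for my $line (@lines) {
    for my $re (@pats) {
        if ($line =~ $re) { $count++; last }    # count each line at most once
    }
}
print "matches = $count\n";    # prints "matches = 2"
```

Each element of @pats is a compiled regular expression object, so the inner loop does the same job as the list of anonymous subroutines in the previous example.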

Reading Data into Nested Structures 

You can handle some common tasks by reading data into one or two ordinary hashes, 
but for more complex analysis tasks you may need to use nested hashes and/or arrays. 

In order to work with nested data structures, you will need an understanding of
reference syntax (too complicated to cover here, sorry!). You should also understand
auto-vivification in Perl. Auto-vivification is a mechanism by which structures linked by
references are created automatically. To illustrate, let’s suppose that the variable $stats
has the value undef. Now, consider the following line of Perl:

$stats->{$host} = {Bytes => $bytes}; 

We are using $stats like a hash reference. Even though $stats is undefined, when we
assign a value to $stats->{$host}, Perl will automatically create the underlying hash
and assign a reference to it to $stats. Now, $stats->{$host}{Bytes} will return
whatever the value of $bytes was. Auto-vivification also works for arbitrarily deeply
nested structures. We could have written the above as:

$stats->{$host}{Bytes} = $bytes; 

And, in fact, that's the idiomatic way to do it in Perl. Nested structures are useful when
you must summarize or reorganize the data in a file. As an example, let’s look at analyzing
httpd logs in Common Log Format (CLF). Let’s create a list of all the different hosts
that connected to the Web server on each day, and print total bytes for each:

my $log = "access_log";

# parenthesize open's arguments so that "or die" tests open's result
open(LOG, $log) or die "Couldn't open $log: $!";
my %bytes;
while (<LOG>) {
    # split line into various fields
    my ($host, $date, $request, $status, $bytes) =
        /(\S+).*?\[([^:]+).*?\]\s+"(.*?)"\s+(\S+)\s+(\S+)/;

    # truncate host name to domain.domain if necessary
    ($host) = ($host =~ /([^.\n]+(?:\.[^.\n]+)?)$/) if
        $host =~ /[a-z]/i;

    next if $bytes =~ /\D/;    # skip if $bytes is non-numeric

    $bytes{$date}{$host} += $bytes;
}

for my $date (sort keys %bytes) {
    print "$date:\n";
    for my $host (
        sort { $bytes{$date}{$b} <=> $bytes{$date}{$a} }
        keys %{$bytes{$date}}) {
        print "  $host: $bytes{$date}{$host} bytes\n";
    }
}

The first part of this program (the while loop) reads in the log file a line at a time, 
extracting the various interesting parts of each line. (We aren’t using $status or 
$request here, but I left them in for clarity.) The hostname is cleaned up, and lines 
where no bytes were transferred are ignored; then the number of bytes is added to an 
“accumulator” in a nested hash. A transfer of 5,000 bytes on 02/Jan/1999 from a host 
named foo.bar would be added like this: 

$bytes{"02/Jan/1999"}{"foo.bar"} += 5000; 

Auto-vivification will create the appropriate underlying hashes and references anew if 
there is no existing entry for that date and/or host. The second part of the program 
sorts and prints out the dates and hostnames in a useful format, ordered first by date 
(alphabetically, for simplicity’s sake) and then in descending order by number of bytes 
transferred. 
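
The accumulate-then-report pattern is not specific to Perl. As a rough sketch, assuming the log lines have already been split into (date, host, bytes) tuples, the same nested accumulator can be written in Python with a defaultdict, which creates the inner dictionaries on first use much as auto-vivification does. The record values and function names below are invented for illustration.

```python
from collections import defaultdict

def total_bytes(records):
    """Sum bytes per date and host; missing levels are created on first use."""
    totals = defaultdict(lambda: defaultdict(int))
    for date, host, nbytes in records:
        totals[date][host] += nbytes
    return totals

def report(totals):
    """Report lines: dates sorted as strings, hosts by descending byte count."""
    lines = []
    for date in sorted(totals):
        lines.append(f"{date}:")
        hosts = sorted(totals[date], key=lambda h: totals[date][h], reverse=True)
        for host in hosts:
            lines.append(f"  {host}: {totals[date][host]} bytes")
    return lines

# Made-up records standing in for parsed log lines
records = [
    ("02/Jan/1999", "foo.bar", 5000),
    ("02/Jan/1999", "example.org", 700),
    ("02/Jan/1999", "foo.bar", 250),
    ("01/Jan/1999", "example.org", 42),
]
print("\n".join(report(total_bytes(records))))
```

As in the Perl version, the dates are ordered as plain strings, which is only chronological within a single month.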

I’ll finish with one more example. This time, let’s look through the log and print out 
stats for the five largest transfers: 


my $log = "/etc/httpd/logs/access_log";

# parenthesize open's arguments so that "or die" tests open's result
open(LOG, $log) or die "Couldn't open $log: $!";

# initialize so -w is happy
my @largest = map { +{ Bytes => 0 } } 1..5;

while (<LOG>) {
    # split line into various fields
    my ($host, $time, $request, $status, $bytes) =
        /(\S+).*?\[(.*?)\]\s+"(.*?)"\s+(\S+)\s+(\S+)/;

    # truncate host name to domain.domain if necessary
    ($host) = ($host =~ /([^.\n]+(?:\.[^.\n]+)?)$/) if
        $host =~ /[a-z]/i;

    next if $bytes =~ /\D/;    # skip if $bytes is non-numeric

    # keep track of largest so far; re-sort if changed
    # (compare against $largest[4], the smallest of the five kept)
    if ($largest[4]{Bytes} <= $bytes) {
        @largest = sort { $b->{Bytes} <=> $a->{Bytes} }
            @largest[0..3],
            { Host => $host, Time => $time, Request => $request,
              Bytes => $bytes
            };
    }
}

for (@largest) {
    print "$_->{Host}: $_->{Bytes} bytes on $_->{Time}" .
        " for request $_->{Request}\n";
}

In this program we’re using nested structures to keep track of information about a list
of the largest transfers found so far. $largest[0] is a reference to a hash containing
information (host, time, request, bytes) about the largest transfer seen so far,
$largest[1] contains information about the second-largest one seen so far, and so on.
Whenever a new, larger transfer is encountered, the new transfer is added to the list and
the list is re-sorted.
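
For readers who want to see the "keep the N largest" idea in isolation, here is a small sketch in Python (not Perl), using the standard heapq module instead of re-sorting by hand. It assumes each log line has already been turned into a dictionary with a Bytes field; the host names and sizes are made up.

```python
import heapq

def largest_transfers(records, n=5):
    """Return the n records with the most bytes, largest first."""
    return heapq.nlargest(n, records, key=lambda r: r["Bytes"])

# Made-up transfers standing in for parsed log lines
sizes = [5000, 120, 98000, 300, 45000, 7, 64000]
records = [{"Host": f"host{i}.example", "Bytes": b} for i, b in enumerate(sizes)]

for r in largest_transfers(records):
    print(f"{r['Host']}: {r['Bytes']} bytes")
```

The design choice is the same as in the Perl program: only a handful of records are held at any time, so memory use does not grow with the size of the log.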

Both of these programs run reasonably quickly - under a minute on 20MB log files on 
an older Sparc 20. 

Summary 

Perl is a powerful tool for analyzing and summarizing log files and other types of text 
databases. I’ve tried to show a few simple examples as well as some meatier ones. Of 
course, you don’t always have to construct your own analysis code from scratch. There 
are CPAN modules that will help you analyze Web and other logs, so if you have a more 
complex analysis task, be sure to check there to see whether your problem might already 
be partially or completely solved for you. 
the dark side of regular
expressions

Recently, I was asked to look into a problem with a ksh script; it seemed to be
hanging. The script looked innocuous, so I did the normal tracing thing via ksh -x.
It seemed to hang in a pipeline involving a sed command that was essentially
doing a basename of the fourth field. Each record in the input file was a single line,
with fields separated by blanks. It looked like this:

systemx=; sed 1q $input
satla cserver 



/gecko/rcv/compressed/IBMZZZZZ.YYYXXXX.IBL94.AOHOLD.G1589VOO.2030.19981119122414.mpd_emc.gz 
/gecko/rcv/cn/IBMZZZZZ.YYYXXXX.IBL94.AOHOLD.G1589VOO.2030.19981119122414.mpd_emc.gz 
mpd_emc 149266637 1181 Nov 19 12:24 


So first I verified that the command was not hanging, but just taking a very long time. This would also give us a
baseline against which to evaluate any improvements we might make.

systemx=; sed 1000q $input | ptime sed 's:.*/\([^ ]*\).*:\1:' > /dev/null
real 2:17.423
user 2:13.372
sys 0.071


This is grisly. (The file was much larger than 1000 lines!) To reassure myself that it was a regular-expression problem,
and not data related, I used nawk to filter out the fourth field.

systemx=; sed 1000q $input | ptime nawk '{print $4}' > /dev/null
real 0.079 
user 0.061 
sys 0.014 

As we suspected, it wasn’t data related. Let's try sed on the filtered text:

systemx=; sed 1000q $input | nawk '{print $4}' | ptime sed 's:.*/\([^ ]*\).*:\1:' > /dev/null
real 37.815
user 36.842
sys 0.057

Hmmm, a 4x improvement for processing 2.7x less data. This smells nonlinear. We can take advantage of the filtering
to use a simpler pattern:

systemx=; sed 1000q $input | nawk '{print $4}' | ptime sed 's:.*/::' > /dev/null
real 0.165 
user 0.147 
sys 0.014 

Yup, it looks like the backreferencing was the culprit all along. In general, backreferencing can take exponential
time, but we rarely see such behavior. This time I guess we were lucky. Of course, now the pattern is so simple that
we may as well do it all in nawk:

systemx=; sed 1000q $input | ptime nawk '{sub(".*/", "", $4); print $4}' > /dev/null
real 0.099
user 0.077
sys 0.014


Overall, the CPU (user) time is about 1700x faster, and as they say in the performance business, you eventually will
notice factors of 1700. Ongoing, this change saved about six hours of CPU time per day. A good return on 10 minutes
of real thought.
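
The end result, "take the fourth field and strip its directory part," needs very little pattern matching at all. As a hedged sketch in Python (not the ksh, sed, or nawk used above), with a shortened, made-up input line, the two simple approaches look like this:

```python
import re

def basename_regex(path):
    """Equivalent of sed 's:.*/::' : drop everything up to the last slash."""
    return re.sub(r".*/", "", path, count=1)

def basename_split(path):
    """Same result with no regular expression at all."""
    return path.rsplit("/", 1)[-1]

def fourth_field_basename(line):
    """Pick the fourth blank-separated field, then strip its directory part."""
    fields = line.split()
    return basename_split(fields[3])

# Shortened, made-up line in the same shape as the input shown above
line = ("satla cserver /gecko/rcv/compressed/file.gz /gecko/rcv/cn/file.gz "
        "mpd_emc 149266637 1181 Nov 19 12:24")
print(fourth_field_basename(line))
```

Splitting the line first and only then trimming one short field is the same ordering that made the nawk version fast: the expensive step never sees the whole record.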
By now, most folks have heard about Tcl and Tk, but it seems only fair to introduce
a new column with an introduction to the topic.

The Tcl package was developed by Dr. John Ousterhout in the late 1980s while he was
at the University of California at Berkeley. He and his group were developing circuit
simulators and found that each project needed a macro language to tune the system.
After designing several on-the-fly macro languages, Dr. Ousterhout designed a package
that could be merged into the other projects to provide a uniform language across the
projects.

Since then, desktop computers have grown from “fast” 33 MHz 386 processors to “slow”
300 MHz Pentiums, and Tcl has grown from a simple embeddable macro language into
a multipurpose package.

Describing the modern Tcl is a lot like describing the proverbial elephant.

If you look at Tcl from one angle, it's a scripting language, like Perl, awk, or sh. If you
look at the other side of Tcl/Tk, it's a GUI programming language similar to Visual
Basic, or a multiplatform language like Java. From another angle, Tcl is an interpreter
that you can extend with your own commands (or existing libraries). Finally, Tcl is a
language toolkit that you can merge into your program.

Just to add a bit to the indescribable nature of Tcl, it's commercially supported freeware.
The interpreters (including source code) are supported and made available for
free from Scriptics (<http://www.scriptics.com>).

The core Tcl distribution comes with two interpreters: tclsh, a text-based interpreter
suitable for text-oriented programs like CGI, shell scripts, or even client-server programs;
and wish, the same base interpreter as tclsh extended with GUI graphics-oriented
commands. There are also two libraries, one for Tcl and one for Tk, which let you
build your own interpreter or merge Tcl/Tk into your application.

The ability to merge new object-code libraries into the interpreter is the feature that
distinguishes Tcl from scripting languages like sh and awk. This feature lets you merge
a vendor-supplied library (*.dll or *.a) right into the Tcl or Tk interpreter to create
an interpreter with new task-specific commands.

Internet pioneer Einar Stefferud sometimes explains that the hallmark of a good 
Internet protocol is that it is simple at the core, with complexities at the edges. The 
most popular protocols (SMTP, HTTP, NNTP) follow a simple query/response format, 
with the complexity living in the message content, not the protocol. 

Tcl follows a similar pattern: The core Tcl scripting language has a simple and regular
syntax with a fairly small number of commands. The complex edges, in this case, are
the extensions. The interpreter extensions have new commands that interact with a new
object code library, while the language core stays the same.

Using Tcl makes it easy to move from problem domain to problem domain. You don't
need to learn a whole new language. You just need to learn the new commands for that
application.

Several years ago I proved that a novice (me) armed with John Ousterhout's book
could create an interpreter with a new set of commands in one evening. Now, with the
discussions of extension building in books from Brent Welch, J. A. Zimmer, and myself,
the learning curve may be shorter.
The Tcl syntax is trivial:

■ The first word on a line is a command name. 

■ Words are separated by whitespace. 

■ Words can be grouped with curly braces ({}) or quotes (""). 

■ A line can be continued across several lines by escaping the newline with a backslash. 

■ Any words in a command line after the first word are arguments to the command. 

■ A variable name preceded by a dollar sign ($) is replaced by the value of the variable.

■ A command within square brackets is replaced by the results of evaluating that command.
This is similar to how back-quotes (`) are handled by shell scripts.

■ A comment starts with a “#” symbol. 

So, let's take a cursory look at some Tcl commands. With just seven commands, we can
build a GUI-based calculator.





Assigning Values to Variables 

Probably the most used command in Tcl is set. The set command assigns a value to a
variable.

Syntax: set variableName value 

set foo "bar"; # Assigns the string "bar" to variable foo. 

set pi 3.1415; # Assigns the value 3.1415 to the variable pi. 

set x a b; # An illegal operation, only one value can be set. 

Tcl also enables you to append a new string to the value already in a variable. This is
done with the append command.

Syntax: append variableName value 

append foo "baz"; # Appends the string "baz" to variable foo.
append pi 92;     # Add two digits of accuracy to the previous value of pi


Performing Math 

The command to perform arithmetic operations is expr, which behaves like the
Bourne shell expr command. Tcl supports all the math calls in the standard C math
library, including the trig and exponential functions.

Syntax: expr algebraicExpression

set twoPi [expr $pi * 2];                    # set the variable twoPi to 2 * pi.
set circumference [expr $twoPi * $radius];   # circumference is 2*pi*radius
set area [expr pow($radius, 2) * $pi];       # area is pi * radius^2

One extremely common math operation is simply incrementing or decrementing a
variable by an integer. To make life a bit simpler, Tcl has a special command for adding
a value to a variable: incr.

Syntax: incr variableName value
set x 2;    # Set x to 2
incr x 4;   # Add 4 to the value of x. x is now 6.
incr x -2;  # Subtract 2 from the value of x. x is now 4.


Looping 

Tcl supports a loop-on-counter construct (for), a loop-on-test construct (while), and
a loop-on-list-contents construct (foreach).

The calculator example uses only foreach, so that's all I’ll describe here.

Syntax: foreach variableName list { body } 

The foreach command will iterate through the values in the list. It will evaluate the 
body after setting the value of the loop variable to the appropriate list element for this 
iteration. 

# Initialize the total to 0
set total 0;

# For each value in the list "2 4 9"
# add that value to the previous value of total
foreach value {2 4 9} {
    incr total $value
}

GUI Widgets 

Tk supports many graphic widgets for building GUIs, including a drawable canvas, an 
editable text window, and a picture object that supports simple image operations. For 
this example, we’ll just need two widgets and a geometry manager. 

The widget creation commands all follow a common format: 

Syntax: widgetType widgetName ?arguments? 

The widgetType is the type of widget to create: button, label, canvas, etc. 

The widgetName is a name for this specific instance of the widget. The naming convention
for Tk widgets is that widget names must be unique and must start with a
period/lowercase letter pair.

The arguments enable you to specify widget configuration options like the text to display,
the foreground and background color, and size of margins. These are defined as
-optionName value pairs.

All parameters of a Tk widget can be set when the widget is created and also modified 
after a widget exists. However, unlike programming with the X library, the widgets have 
a set of good defaults, so you don’t need to define all the parameters when you create a 
widget. 

Button 

One common GUI widget is the button widget. This widget will display a string (or 
graphic) and perform an action when the button is clicked. 

Syntax: button .buttonName ?arguments? 

Two commonly used arguments are: 

-text string The text to display on the button. 

-command body The body of a command to evaluate when the button is activated. 

Label 

A label simply displays a string. One of the neat features of the Tk label is that you 
can link the label to a variable, and it will automatically display the contents of that 
variable. Your code doesn’t need to do anything to update the display. 
Syntax: label .labelName ?arguments?

-textvariable variableName This label will display the contents of the named
variable.

-text string This label will display a particular string.

Grid 

Tk supports three layout managers to specify how your application should look on the 
screen. For the calculator example, the “grid” manager, which lays out widgets in a 
spreadsheet style, is the simplest to use. The grid command defines where a widget will 
appear and maps the widget onto the display. 

Syntax: grid widgetName ?arguments? 

-row rowNumber The row for this widget. 

-column columnNumber The column for this widget.

-columnspan number The number of columns this widget will use. 

A Calculator 

With these seven commands, we can construct a little GUI calculator. Now, this is not 
the last word in online calculators, but it’s an example of how little code you need to 
create a useful Tcl/Tk application. Complete with comments, this is 50 lines of code. 

# Initialize a string that will contain the math operations
# to perform
set math ""

# Initialize a position counter for the widgets being created.
set pos 0

# Loop through the numbers and operations creating buttons for
# each widget
foreach val {1 2 3 4 5 6 7 8 9 + 0 - * /} {
    # Create a button
    # The button names are .b0, .b1, .b2, etc.
    # The text to display is the current item in the list
    # The action for the button is to append that value onto the
    # math string
    button .b$pos -text $val -command "append math $val"

    # The buttons are displayed in a 3-column-wide grid
    # Calculate the row and column
    set row [expr ($pos) / 3]
    set col [expr ($pos) % 3]

    # And map the widget to the screen
    grid .b$pos -row $row -column $col

    # Increment the position/name counter
    incr pos
}

# The equals button has a different command.
# When the equals button is clicked, it will evaluate the math
# expression, and assign the output to the result variable.
# It then clears the math expression for the next set of
# calculations.
button .b_eq -text "=" -command {set result [expr $math]; set math ""}
grid .b_eq -row 4 -column 2

# Create two labels to display the math expression and result.
label .math -textvariable math
grid .math -row 5 -column 0 -columnspan 3
label .result -textvariable result
grid .result -row 6 -column 0 -columnspan 3
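
The only arithmetic in the calculator is the mapping from a button's position counter to its grid cell. As a quick check of that layout, here is a small sketch in Python (not Tcl, and with no GUI) that applies the same integer division and remainder to the 14 button labels; the function name grid_positions is made up for illustration.

```python
def grid_positions(labels, columns=3):
    """Map each label to (row, column), filling rows left to right."""
    # divmod(pos, columns) is (pos // columns, pos % columns),
    # the same pair the Tcl code computes with expr
    return {label: divmod(pos, columns) for pos, label in enumerate(labels)}

labels = "1 2 3 4 5 6 7 8 9 + 0 - * /".split()
positions = grid_positions(labels)
for label in labels:
    row, col = positions[label]
    print(f"{label!r}: row {row}, column {col}")
```

The last two labels land in row 4, columns 0 and 1, which leaves row 4, column 2 free for the equals button.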
Learning More

If you don’t already know Tcl/Tk, you are (I hope) interested in learning a bit more by 
now. Here are a few books and Web sites that will get you started. 

John Ousterhout, Tcl and the Tk Toolkit. Addison-Wesley, 1994.

The definitive book, but somewhat dated.

Brent Welch, Practical Programming in Tcl and Tk. Prentice Hall, 1997.

An excellent book for the experienced programmer.

Eric Foster-Johnson, Graphical Applications with Tcl and Tk. IDG Books Worldwide,
1997.

A good introductory book. 

Clif Flynt, Tcl/Tk for Real Programmers. AP Professional, 1998. 

I think it’s a good book, but I may be biased. 

Here are some sites with general Tcl/Tk information: 

<http://www.scriptics.com> 

The Scriptics home page. Up-to-date information on the state of Tcl; free source code;
supported binary downloads; for-sale development utilities; training, support, and
pointers to Tcl/Tk resources.

<http://www.tclconsortium.org>

The Tcl/Tk Consortium home page. The Tcl/Tk Consortium is a nonprofit organization
of Tcl advocates with a charter to make Tcl known and available to the computing
community. The Web site includes links to resources; information; and a chance to buy
precompiled versions of Tcl and Tcl extensions for popular platforms.

<http://Starbase.NeoSoft.COM/~claird/comp.lang.tcl/>

One of the best collections of pointers to Tcl “stuff,” ranging from discussions of Tcl
fine points to tutorials, books, articles and FAQs.

And, finally, some sites with online or CAI Tcl/Tk instruction: 

<http://www.msen.com/~clif/TclTutor.html> 

The TclTutor interactive computer-based training package for Win 95/NT, UNIX, and 
Macintosh. 

<http://hegel.ittc.ukans.edu/topics/tcltk/tutorial-noplugin/index.html> 

Robert Hill, Shyamalan Pather, and Matt Peters created this 13-lesson tutorial on the Tcl
language.

<http://www.dci.clrc.ac.uk/Publications/Cookbook/index.html>

This is an excellent and complete tutorial by Lakshmi and Venkat Sastry. It covers Tcl,
Tk, and building extensions.

<http://www.cujo.com/tcl_tut.html>

William Ho (<bill@technologyarchitects.com>) has written a concise introduction to the Tcl
language.
electronic snake oil


So your company has finally decided to heed the advice you gave them 11
years ago, and they’re going to start using email to communicate with business
partners. Or, more likely, they’ve already been doing that, but now they have
concerns about the security of doing so. Specifically, they don't want competitors
reading the memos that are going back and forth. Or maybe you’re sending
email to your aunt Lena who lives in Russia and you don't think what you have
to tell her is the business of anyone at Ft. Meade. Or your company has a number
of small offices around the country that you want to tie together through a
virtual private network (VPN) over the Internet.

In any of these cases, you have a generic need to send data to another party privately.
You need to lock your data. You need cryptography. But how does a system administrator,
or some other technical person who isn’t a cryptographer, know the difference
between a good cryptography product and the stuff of Cracker Jack secret decoder
rings? The short answer is that there isn’t any really easy answer. However, over the
years, a few common practices have developed that have helped us identify the traits of
those whose products are “snake oil.” Before we consider those, though, let’s cover some
terminology and basic concepts of cryptography.



by Matt Curtin

Curtin is a hacker for hire,
focusing on the areas of network
computing, security,
and open operating systems.
When not cleaning his white
hat, he likes to build houses
out of his ever-growing collection
of old business cards.

<cmcurtin@interhack.net>


Adapted from a Usenet Periodic Posting. See
<http://www.interhack.net/people/cmcurtin/snake-oil-faq.html>.

Snake Oil and Silver Bullets

Why “snake oil”? In many fields, the term is used to denote something sold without
consideration of its quality or its ability to fulfill its vendor’s claims. This term originally
applied to elixirs sold in traveling medicine shows. The salesmen would claim their
elixir would cure just about any ailment that a potential customer could have. To anyone
listening to the claims made by some crypto vendors, “snake oil” is a surprisingly apt name.

Basic Concepts 

A wide variety of information on cryptography is available. There is the USENET
Cryptography FAQ, RSA’s Cryptography Today FAQ, and books such as Bruce
Schneier’s excellent Applied Cryptography[1].

When evaluating any product, be sure to understand your needs. For data-security
products, what are you trying to protect? Do you want a data archiver, an email plug-in,
or something that encrypts online communications? Do you need to encrypt an
entire disk or just a few files? And how secure is secure enough? Does the data need to
be unreadable by “spies” for five minutes, one year, or 100 years? Is the spy someone’s
kid sister, a corporation, or a government?

Symmetric versus Asymmetric Cryptography 

There are two basic types of cryptosystems: symmetric (also known as “conventional” 
or “secret key”) and asymmetric (“public key”). 

Symmetric ciphers require both the sender and the recipient to have the same key. This 
key is used by the sender to encrypt the data and again by the recipient to decrypt the 
data. The problem here is getting the sender and recipient to share the key. 

Asymmetric ciphers are much more flexible from a key-management perspective. Each
user has a pair of keys: a public key and a private key. Messages encrypted with one key
can only be decrypted by the other key. The public key can be published widely, while
the private key is kept secret. So if Alice wishes to send Bob some secrets, she simply
finds and verifies Bob's public key, encrypts her message with it, and mails it off to Bob.
When Bob gets the message, he uses his private key to decrypt it. Verification of public
keys is an important step. Failure to verify that the public key really does belong to Bob
leaves open the possibility that Alice is using a key whose associated private key is in the
hands of an enemy.

Asymmetric ciphers are much slower than their symmetric counterparts. Also, key sizes 
generally must be much larger. 

Secrecy versus Integrity: What Are You Trying to Protect? 

For many users of computer-based crypto, preserving the contents of a message is as 
important as protecting its secrecy. Damage caused by tampering can often be worse 
than damage caused by disclosure. For example, it may be disquieting to discover that a 
cracker has read the contents of your funds-transfer authorization, but it’s a disaster for 
him to change the transfer destination to his own account. 

Encryption by itself does not protect a message from tampering. In fact, there are several
techniques for changing the contents of an encrypted message without ever figuring
out the encryption key. If the integrity of your messages is important, don't rely on
secrecy alone to protect them. Check how the vendor protects messages from undetected
modification.
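
To make "protection from undetected modification" concrete, here is a minimal sketch in Python. The article does not name a mechanism; a keyed message authentication code (HMAC with SHA-256, from the Python standard library) is one widely used option and is shown here purely as an illustration. The key and messages are invented.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def is_untampered(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it with the one received."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared-secret-for-illustration"        # made-up key
original = b"transfer 100 to account 12345"    # made-up message
tag = make_tag(key, original)

tampered = b"transfer 100 to account 99999"    # attacker changes the destination
print(is_untampered(key, original, tag))
print(is_untampered(key, tampered, tag))
```

The first check succeeds and the second fails: without the key, an attacker who alters the message cannot produce a matching tag. Whether a given product does something comparable is exactly the question to put to the vendor.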

Key Sizes 

Even if a cipher is secure against analytical attacks, it will be vulnerable to brute-force
attacks if the key is too small. In a brute-force attack, the attacker simply tries every
possible key until the right one is found. How long this takes depends on the size of the
key, how computationally intensive the encryption (or decryption) process is, and the
amount of processing power available. So when trying to secure data, you need to consider
how long it must remain secure and how much computing power an attacker can
use.

Some guidelines have been offered for choosing an appropriate key length. For 
instance, Table 1 shows the cost of breaking symmetric keys by brute force, as noted by 
Blaze et al.[2]. This report strongly recommends using symmetric keys of 90 bits or 
more. 

With the tremendous increases in computing power over the last several decades, cryptosystems
once considered secure are now vulnerable to brute-force attacks. RSA
Laboratories sponsored a series of contests, collectively known as the 1997 Secret Key
Challenge[3]. So far, we have seen RC5 up to 56 bits fall victim to brute-force attacks,
as well as the financial industry's workhorse, DES. At 56 bits, the keys used for DES are
just too small to stand up to a dedicated attacker. It's noteworthy that both of the first
two groups to break a DES-encrypted message did so with essentially no funding. The
Electronic Frontier Foundation funded a third break, performed with a special-purpose
DES-cracking machine, “Deep Crack,” which did the job in 56 hours.

If small nonprofit groups can fund development of a machine like Deep Crack, certainly
the power that exists in larger for-profit organizations, organized crime, and government
intelligence agencies can go well beyond 56 bits.
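
The effect of key length on a brute-force search is simple arithmetic: each added bit doubles the number of keys. The following Python sketch works that out for a few key sizes. The search rate of one billion keys per second is a hypothetical figure chosen only to make the numbers readable; it is not taken from the report cited above.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def brute_force_seconds(key_bits, keys_per_second):
    """Worst-case time to try every key of the given length."""
    return 2 ** key_bits / keys_per_second

RATE = 1_000_000_000  # hypothetical attacker: one billion keys per second

for bits in (40, 56, 64, 90, 128):
    years = brute_force_seconds(bits, RATE) / SECONDS_PER_YEAR
    print(f"{bits:3d}-bit key: {years:.3g} years to exhaust the key space")
```

On average an attacker finds the key after searching half the key space, so the expected time is half the worst case. Either way, going from 40 to 56 bits multiplies the work by 2^16, which is why a few extra bits matter far more than a faster machine.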
As mentioned earlier, asymmetric ciphers typically require significantly longer keys to
provide the same level of security as symmetric ciphers. Comparing key lengths
between algorithms is awkward because different algorithms have different characteristics.
Knowing the key size is useless if you don’t know what type of algorithm is being
used.

But to give you some idea of what's reasonable, Table 2 [1] compares symmetric keys
against one type of asymmetric key: those based on the “factoring problem” or the
“discrete log problem.” (Algorithms based on the “elliptic curve discrete log problem”
are more resistant to brute-force attacks and can use much smaller keys. In fact, they
don’t have to be much larger than symmetric keys, as far as is known now.)

Table 1: Time and Cost of Key Recovery

Type of Attacker      Budget    Tool                     Time and Cost per        Key Length (bits) Needed
                                                         40-bit Key Recovered     for Protection in Late 1995
Pedestrian hacker     Tiny      Scavenged computer time  1 week                   45
                      $400      FPGA                     5 hours ($0.08)          50
Small business        $10,000   FPGA                     12 minutes ($0.08)       55
Corporate department  $300K     FPGA                     24 seconds ($0.08)
                                ASIC                     .005 seconds ($0.001)    60
Big company           $10M      FPGA                     .7 seconds ($0.08)
                                ASIC                     .0005 seconds ($0.001)   70
Intelligence agency   $300M     ASIC                     .0002 seconds ($0.001)   75

Table 2: Key Lengths with Similar Resistance to Brute-Force Attacks

Symmetric Key Length    Public Key Length
56 bits                 384 bits
64 bits                 512 bits
80 bits                 768 bits
112 bits                1792 bits
128 bits                2304 bits

Keys versus Passphrases

A “key” is not the same thing as a “passphrase” or “password.” In order to resist attack,
all possible keys must be equally probable. If some keys are more likely to be used than
others, then an attacker can use this information to reduce the work needed to break
the cipher.

Essentially, a key must be random. However, a passphrase generally needs to be easy to
remember, so it has significantly less randomness than its length suggests. For example,
a 20-letter English phrase, rather than having 20 x 8 = 160 bits of randomness, only has
about 20 x 2 = 40 bits of randomness. So, most cryptographic software will convert a
passphrase into a key through a process called “hashing” or “key initialization.” Avoid
cryptosystems that skip this phase by using a password directly as a key. Avoid anything
that doesn't let you generate your own keys (e.g., the vendor sends you keys in the mail,
or keys are embedded in the copy of the software you buy).
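
Both points in the passphrase discussion, the randomness estimate and the hashing step, can be sketched in a few lines of Python. The figure of 2 bits per letter is the article's estimate. The use of PBKDF2 with SHA-256, the salt value, and the iteration count are assumptions made for this illustration; the article only says that a passphrase should be hashed into a key, not how.

```python
import hashlib

def passphrase_bits(letters, bits_per_letter=2):
    """Rough randomness of an English passphrase (article's estimate: 2 bits/letter)."""
    return letters * bits_per_letter

def derive_key(passphrase: str, salt: bytes, length: int = 16) -> bytes:
    """Hash a passphrase into a fixed-length key (PBKDF2 is an assumed choice)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 100_000, dklen=length
    )

print(passphrase_bits(20), "bits of randomness in a 20-letter phrase")

salt = b"example-salt"  # made-up value; a real system would use a random salt
key = derive_key("correct horse battery", salt)
print(len(key) * 8, "bit key:", key.hex())
```

Note that hashing does not add randomness: a 128-bit key derived from a 40-bit passphrase is still only as hard to guess as the passphrase. What hashing buys is a key of the right size whose bits are evenly spread, instead of raw text used directly as key material.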

Implementation Environment 

Other factors that can influence the relative security of a product are related to its environment.
For example, in software-based encryption packages, is there any plaintext
that's written to disk (perhaps in temporary files)? What about operating systems that
have the ability to swap processes out of memory onto disk? When something to be
encrypted has its plaintext counterpart deleted, is the extent of its deletion a standard
removal of its name from the directory contents, or has it been written over? If it's been
written over, how well has it been written over? Is that level of security an issue for you?
Are you storing cryptographic keys on a multi-user machine? If so, the likelihood of
having your keys illicitly accessed is much higher. It’s important to consider such things
when trying to decide how secure something you implement is (or isn’t) going to be.

Snake Oil Warning Signs 

“Trust Us, We Know What We're Doing”

Perhaps the biggest warning sign of all is the “trust us, we know what we're doing” message
that's either stated directly or implied by a vendor. If the vendor is concerned
about the security of their system after describing exactly how it works, it is certainly
worthless. Regardless of whether or not they tell, smart people will be able to figure it
out. The bad guys after your secrets (particularly if you are an especially attractive target,
such as a large company, bank, etc.) are not stupid. They will figure out the flaws. If the
vendor won’t tell you exactly and clearly what's going on inside, you can be sure that
they’re hiding something, and that the only one to suffer as a result will be you, the
customer.

Technobabble 

If the vendor's description appears to be confusing nonsense, it might very well be so, 
even to an expert in the field. One sign of technobabble is a description that uses newly 
invented terms or trademarked terms without actually explaining how the system 
works. Technobabble is a good way to confuse a potential user and to mask the ven¬ 
dor’s own lack of expertise. 

And consider this: If the marketing material isn’t clear, why expect the instruction 
manual to be any better? Even the best product can be useless if it isn’t applied
properly. If you can’t understand what a vendor is saying, you’re probably better off finding
something that makes more sense. 

Secret Algorithms 

Avoid software that uses secret algorithms. This is not a safe means of protecting data.
If the vendor isn’t confident that its encryption method can withstand scrutiny, then
you should be wary of trusting it.

A common excuse for not disclosing an algorithm is that “hackers might try to crack 
the program’s security.” While this may be a valid concern, it should be noted that such 
“hackers” can reverse-engineer the program to see how it works anyway. This is not a 
problem if the algorithm is strong and the program is implemented properly. 

Using a well-known trusted algorithm, providing technical notes explaining the implementation,
and making the source code available are signs that a vendor is confident
about its product’s security. You can take the implementation apart and test it yourself.
Even if the algorithm is good, a poor implementation will render a cryptography product
completely useless. However, a lock that attackers can’t break even when they can
see its internal mechanisms is a strong lock indeed. Good cryptography is exactly this
kind of lock.

Note that a vendor who specializes in cryptography may have a proprietary algorithm 
that it will reveal only under a nondisclosure agreement. The crypto product may be 
perfectly adequate if the vendor is reputable. (But how does a nonexpert know if a vendor
is reputable in cryptography?) In general, you’re best off avoiding secret algorithms.

Revolutionary Breakthroughs 

Beware of any vendor who claims to have invented a “new type of cryptography” or a 
“revolutionary breakthrough.” True breakthroughs are likely to show up in research literature,
and professionals in the field typically won’t trust them until after years of
analysis, when they’re not so new anymore. 

The strength of any encryption scheme is only proven by the test of time. New crypto is 
like new pharmaceuticals, not new cars. And in some ways it’s worse: If a pharmaceutical
company produces bogus drugs, people will start getting sick, but if you’re using
bogus crypto, you probably won’t have any indication that your secrets aren’t as secret 
as you think. 

Avoid software that claims to use “new paradigms” of computing such as cellular 
automata, neural nets, genetic algorithms, chaos theory, etc. Just because software uses 
a different method of computation, it isn’t necessarily more secure. (In fact, these techniques
are the subject of ongoing cryptographic research, and nobody has published
successful results based on their use yet.) 

Also be careful of specially modified versions of well-known algorithms. Such modifications may
intentionally or unintentionally weaken the cipher.

It’s important to understand the difference between a new cipher and a new product. 
Engaging in the practice of developing both ciphers and cryptographic products is a 
fine thing to do. However, to do both at the same time is foolish. Many snake-oil vendors
brag about how they do this, despite the lack of wisdom in such activity.

Experienced Security Experts, Rave Reviews, and Other Useless Certificates 

Beware of any product that claims it was analyzed by “experienced security experts” 
without providing references. Always look for the bibliography. Any cipher that they’re 
using should appear in a number of scholarly references. If not, it’s obviously not been 
tested well enough to prove or disprove its security. 

Don’t rely on reviews from newspapers, magazines, or television shows, since they generally
don’t have cryptographers to analyze software for them. (Celebrity “hackers” who
know telephone systems are not necessarily crypto experts.) 

The fact that a vendor is a well-known company or the algorithm is patented doesn’t 
make it secure either. 

Unbreakability 

Some vendors will claim their software is “unbreakable.” This is marketing hype and a 
common sign of snake oil. No algorithm is unbreakable. Even the best algorithms are 
susceptible to brute-force attacks, though this can be impractical if the key is large 
enough. 
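
The effect of key size on a brute-force search is easy to put in numbers. The Python sketch below estimates the expected search time for a few key lengths. The trial rate of 10^12 keys per second is purely an assumption picked for illustration, not a measurement of any real attacker, and the function name `brute_force_years` is likewise hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Assumed attacker speed, for illustration only.
ASSUMED_KEYS_PER_SECOND = 1e12


def brute_force_years(key_bits, keys_per_second=ASSUMED_KEYS_PER_SECOND):
    """Expected years to find a key by exhaustive search.

    On average the right key turns up after trying half the keyspace.
    """
    keyspace = 2 ** key_bits
    expected_trials = keyspace / 2
    return expected_trials / keys_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (40, 56, 128):
        print(f"{bits:3d}-bit key: about {brute_force_years(bits):.3g} years")
```

Each added key bit doubles the work, so the difference between a small key and a large one is a matter of many orders of magnitude, not of degree.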

April 1999 ;login:



Some companies that claim unbreakability actually have serious reasons for saying so.
Unfortunately, these reasons generally depend on some narrow definition of what it
means to “break” security. For example, one-time pads (see the next section) are technically
unbreakable as far as secrecy goes, but only if several difficult and important conditions
are true. Even then, they are trivially vulnerable to known-plaintext attacks on
the message’s integrity. Other systems may be unbreakable only if one of the communicating
devices (such as a laptop) isn’t stolen. So be sure to find out exactly what the
“unbreakable” properties of the system are, and see if the more breakable parts of the
system also provide adequate security.
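
The integrity weakness mentioned above can be sketched in a few lines of Python, assuming the pad is combined with the message by XOR (the usual construction, though the text does not spell it out). The messages and the helper names `xor_bytes` and `forge` are invented for illustration, and `os.urandom` merely stands in for a pad. The attacker never learns the pad, yet rewrites the message.

```python
import os


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def forge(ciphertext, known_plaintext, wanted_plaintext):
    """Turn a ciphertext of known_plaintext into one of wanted_plaintext.

    Requires no knowledge of the pad, only of the original message.
    """
    return xor_bytes(ciphertext, xor_bytes(known_plaintext, wanted_plaintext))


# Invented example messages of equal length.
message = b"PAY ALICE $0100"
wanted = b"PAY MALLO $9999"

pad = os.urandom(len(message))        # stand-in for the shared pad
ciphertext = xor_bytes(message, pad)  # what the sender transmits

forged = forge(ciphertext, message, wanted)  # attacker's substitution
received = xor_bytes(forged, pad)            # what the recipient decrypts
```

Secrecy is intact throughout, which is exactly why a claim of "unbreakable" needs to say what property is meant.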

Often, less-experienced vendor representatives will roll their eyes and say, “Of course
it’s not unbreakable if you do such-and-such.” The point is that the exact nature of
“such-and-such” will vary from one product to another. Pick the one that best matches
your operational needs without sacrificing your security requirements.

One-Time Pads 

A vendor might claim the system uses a one-time pad (OTP), which is provably
unbreakable. Technically, the encrypted output of an OTP system is equally likely to 
decrypt to any same-size plaintext. For example, 

598v *$_+- xCtMBO 

has an equal chance of decrypting to any of these: 
the answer is yes 
the answer is no! 
you are a weenie! 
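
A short Python sketch shows why every same-size plaintext is possible. It assumes the pad is combined with the text by XOR and treats the ciphertext above as ASCII bytes; both are assumptions made for illustration. For each candidate plaintext there is exactly one pad that would produce this ciphertext, so the ciphertext alone favors none of them.

```python
# The example ciphertext, taken here as 17 ASCII bytes (an assumption).
ciphertext = b"598v *$_+- xCtMBO"

candidates = [
    b"the answer is yes",
    b"the answer is no!",
    b"you are a weenie!",
]


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def pad_for(ciphertext, plaintext):
    """Return the one pad under which ciphertext decrypts to plaintext."""
    return xor_bytes(ciphertext, plaintext)


# One pad per candidate: each is as plausible as the others.
pads = {p: pad_for(ciphertext, p) for p in candidates}

if __name__ == "__main__":
    for plaintext, pad in pads.items():
        print(plaintext.decode(), "<-", pad.hex())
```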

Snake-oil vendors will try to capitalize on the known strength of an OTP. But it is 
important to understand that any variation in the implementation means that it is not 
an OTP and has nowhere near the security of an OTP. 

An OTP system works by having a “pad” of random bits in the possession of both the 
sender and recipient, but absolutely no one else. Originally, paper pads were used 
before general-purpose computers came into being. The pad must be sent from one 
party to the other securely, such as in a locked briefcase handcuffed to the carrier. 

To encrypt an n-bit message, the next n bits in the pad are used as a key. After the bits 
are used from the pad, they’re destroyed and can never be used again. The bits in the 
pad cannot be generated by an algorithm or cipher. They must be truly random, using 
a real random source such as specialized hardware or radioactive decay timings. Some 
snake-oil vendors will try to dance around this issue and talk about functions they perform
on the bit stream, things they do with the bit stream versus the plaintext, or
something similar. But this still doesn’t change the fact that anything that doesn’t use 
real random bits is not an OTP. The important part of an OTP is the source of the bits, 
not what one does with them. 
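
The bookkeeping described above (take the next n units of pad, use them once, destroy them) can be sketched in Python as follows. The class name `OneTimePad` and its methods are hypothetical, and the pad is combined with the message by XOR, which is an assumption. Note the important caveat: `os.urandom` draws from the operating system's generator, so by this article's own standard the demo is not a true OTP; it only illustrates the mechanics. Zeroing a Python buffer is likewise no guarantee that the used bytes are gone from memory.

```python
import os


class PadExhausted(Exception):
    """Raised when a message needs more pad bytes than remain."""


class OneTimePad:
    """Hands out each pad byte at most once (hypothetical sketch)."""

    def __init__(self, pad_bytes):
        self._pad = bytearray(pad_bytes)
        self._next = 0  # index of the first unused pad byte

    def remaining(self):
        return len(self._pad) - self._next

    def apply(self, data):
        """Encrypt or decrypt data; XOR makes the two operations identical."""
        n = len(data)
        if n > self.remaining():
            raise PadExhausted("not enough unused pad for this message")
        start, end = self._next, self._next + n
        out = bytes(d ^ k for d, k in zip(data, self._pad[start:end]))
        self._pad[start:end] = bytes(n)  # wipe the used portion
        self._next = end                 # never hand these bytes out again
        return out


# Both parties hold identical copies of the pad, delivered out of band.
shared = os.urandom(64)  # stand-in only; not a true random source
sender = OneTimePad(shared)
receiver = OneTimePad(shared)

ciphertext = sender.apply(b"the answer is yes")
plaintext = receiver.apply(ciphertext)
```

Refusing to encrypt once the pad runs out, rather than wrapping around, is the design choice that matters: a system that quietly recycles pad material has stopped being an OTP.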

OTPs are seriously vulnerable if you ever reuse a pad. For instance, the NSA’s VENONA 
project[4], without the benefit of computer assistance, managed to decrypt a series of 
KGB messages encrypted with faulty pads. It doesn’t take much work to crack a reused 
pad. 
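
Why reuse is fatal can be seen in a small Python sketch, again assuming XOR as the combining operation. When two messages share pad bytes, XORing the two ciphertexts cancels the pad entirely and leaves the XOR of the two plaintexts; anyone who knows or guesses one message then reads the other. This illustrates the principle only and is not a reconstruction of the VENONA analysis. The messages are invented and `os.urandom` is a stand-in for a pad.

```python
import os


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


first = b"the answer is yes"
second = b"you are a weenie!"

pad = os.urandom(len(first))  # stand-in for a pad

c1 = xor_bytes(first, pad)
c2 = xor_bytes(second, pad)   # the mistake: the same pad bytes used twice

# The eavesdropper sees only c1 and c2. The pad drops out of their XOR.
leak = xor_bytes(c1, c2)

# Knowing (or guessing) the first message reveals the second outright.
recovered = xor_bytes(leak, first)
```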

The real limitation to practical use of OTPs is the generation and distribution of truly
random keys. You have to distribute at least one bit of key for every bit of data transmitted.
So OTPs are awkward for general-purpose cryptography. They’re only practical
for extremely low-bandwidth communication channels where two parties can exchange
pads with a method different from what they use to exchange messages. (It is rumored
that a link from Washington, D.C., to Moscow was encrypted with an OTP.)




Further, if pads are provided by a vendor, you cannot verify the quality of the pads. 
How do you know the vendor isn’t sending the same bits to everyone? Keeping a copy 
for themselves? Or selling a copy to your rivals? Also, some vendors may try to confuse 
random-session keys or initialization vectors with OTPs. 


Algorithm or Product X is Insecure 

Be wary of anything that claims that competing algorithms or products are insecure, 
without providing evidence for these claims. Sometimes attacks are theoretical or 
impractical, requiring special circumstances or massive computing power over many 
years, and it’s easy to confuse a layman by mentioning these. 

Recoverable Keys 



If there is a key-backup or key-escrow system, are you in control of the backup or does 
someone else hold a copy of the key? Can a third party recover your key without much 
trouble? Remember, you have no security against someone who has your key. 

If the vendor claims it can recover lost keys without using some type of key-escrow service,
avoid it. The security is obviously flawed.

Exportable from the US 

If the software is made in the US, can it be exported? Strong cryptography is considered 
dangerous munitions by the United States and requires approval from the US Bureau 
of Export Administration, under the US Department of Commerce, before it can leave 
the country. Various interested government agencies serve as consultants to the Bureau 
of Export Administration when evaluating such requests. (The US isn’t alone in this; 
some other nations have similar export restrictions on strong cryptography.) Chances 
are, if the software has been approved for export, the algorithm is weak or crackable. 

If a vendor is unaware of export restrictions, avoid its software. For example, if it claims 
that the IDEA cipher can be exported, when most vendors (and the US Government!) 
do not make such a claim, then the vendor is probably lacking sufficient clue to provide 
you with good cryptography. 

Because of export restrictions, some decent crypto products come in two flavors: US-only
and exportable. The exportable version will be crippled, probably by using smaller
keys, making it easy to crack.

There are no restrictions on importing crypto products into the US, so a non-US vendor
can legally offer a single, secure version of a product for the entire world.

Note that a cryptosystem may not be exportable from the US even if it is available outside
the US. Sometimes a utility is illegally exported and posted on an overseas site.

"Military Grade" 

Many crypto vendors claim their system is “military grade.” This is a meaningless term,
since there isn’t a standard that defines “military grade,” other than being in use by
armed forces. Since these organizations don’t reveal what crypto they use, it isn’t possible
to prove or disprove that something is “military grade.”

Unfortunately, some good crypto products also use this term. Watch for this in combination
with other snake-oil indicators, such as “our military-grade encryption system is
exportable from the US!”

Other Considerations 

Avoid vendors who don’t seem to understand anything described in the “Basic 
Concepts” section above. 

Avoid anything that allows someone with your copy of the software to access files, data, 
etc., without needing some sort of key or passphrase. 

Beware of products that are designed for a specific task, such as data archiving, and 
have encryption as an additional feature. Typically, it’s better to use an encryption utility
for encryption, rather than some tool designed for another purpose that adds
encryption as an afterthought. 

No product is secure if used improperly. You can be the weakest link in the chain if you 
use a product carelessly. Do not trust any product to be foolproof, and be wary of any 
product that claims it is. 

Interface isn’t everything; user-friendliness is important, but be wary of anything that 
puts too much emphasis on ease of use without due consideration to cryptographic 
strength. 

Glossary 

algorithm A procedure or mathematical formula. Cryptographic algorithms convert 
plaintext to and from ciphertext. 

cipher Synonym for “cryptographic algorithm” 

escrow A third party able to decrypt messages sent from one person to another.
Although this term is often used in connection with the US Government’s “Clipper”
proposals, it isn’t limited to government-mandated ability to access encrypted information
at will. Some corporations might wish to have their employees use cryptosystems
with escrow features when conducting the company’s business, so the information can
be retrieved should the employee be unable to unlock it himself later (if he were to forget
his passphrase, suddenly quit, get run over by a bus, etc.). Or, someone might wish
her spouse or lawyer to be able to recover encrypted data, etc., in which case she could
use a cryptosystem with an escrow feature.

initialization vector One of the problems with encrypting such things as files in specific
formats (e.g., that of a word processor, email, etc.) is that there is a high degree of
predictability about the first bytes of the message. This could be used to break the
encrypted message more easily than by brute force. In ciphers where one block of data
is used to influence the ciphertext of the next (such as CBC), a random block of data is
encrypted and used as the first block of the encrypted message, resulting in a less predictable
ciphertext message. This random block is known as the initialization vector.
The decryption process also performs the function of removing the first block, resulting
in the original plaintext.

key A piece of data that, when fed to an algorithm along with ciphertext, will yield 
plaintext (or, when fed to an algorithm along with plaintext, will yield ciphertext). 
a management perspective on privileged access to computer systems

by Jim Hickstein
<jxh@jxh.com>

Jim Hickstein was a programmer for many years, but got into UNIX system administration
“in self-defense” in 1990. He has been the manager of a group of sysadmins
almost as long. He is the Treasurer of BayLISA.

Many computer systems define some kind of privileged access, to be sure that
certain sensitive functions are protected and to allow system staff to override
security and other protections in the course of doing their jobs. (UNIX has a
“superuser” called root, which has total control of the system; VAX/VMS has
the bypass bit; Windows NT has the Administrator user. Other systems have
similar things, by various names.) Yet the very term “privileged access” presumes
the existence of unprivileged access, the kind granted to most regular,
authorized users, which some of them find demeaning and overly restrictive.

The determination of who gets privileged access is usually made by the systems
staff. It should be made fairly and reasonably, according to policies that apply
to everyone. But sometimes conflicts arise, and management must be prepared
to resolve them.

The Question

Users who need to do something on the computer requiring privileged access they
don’t have often pose the question, “Can I have the root password?” The first request is
seldom granted without deeper probing; the systems staff asks, “Why? What for?” and
the user has one of several typical responses. The commonest, and easiest to handle, is
“I need it to do A, B, and C.” There are usually ways to accomplish whatever the user
must do short of granting unlimited privileged access, and the systems staff can direct
the user to those tools or methods, up to and including doing something manually on
the user’s behalf.

But sometimes the true reason is, “I need it to protect my position” or “I need it to 
enhance my self-esteem.” Getting to the truth of such reasons can be difficult and the 
source of much of the conflict that can arise. In these cases, it is vital that management 
get involved to resolve the issue without compromising the organization’s goals. 
Managers must understand the principles involved to make the right decision. 

The Stability Argument 

Computer systems are very complex, the more so as they become more flexible and 
powerful, and therefore more valuable to their owners. Even a single-user personal 
computer is complex enough to demand a great deal of its user’s time and effort.
Consider the typical workplace computing environment, which is a network of dozens, 
sometimes thousands, of computers, all interacting in various ways. There’s a lot going 
on. 


Software developers, when they create these systems, have many choices to make about 
how the systems will operate. But to reach the largest market, they leave many of those 
choices up to the customer, calling their systems “flexible” and “policy-neutral.” And 
indeed they are, but someone has to flex them, and someone has to set the policies. 

And someone has to tell the software what the policies are, to configure things so they 
interact correctly. 

Such configuration is invariably central to the correct operation of the system as a 
whole. There is a high risk of a mistake having widespread effects. System configuration 
is typically protected, requiring privileged access to change things, so (the developers 
imagine) only those who know what they are doing will change anything. The system 
will run smoothly, according to the developers’ vision, and the users will be productive 
and happy. 

The stability argument says that most users cannot know all the details of how the system
is configured and implemented, so they cannot always make informed choices
about what to change. A little knowledge is a dangerous thing. Even a lot of knowledge
may not be enough to avoid making a serious mistake. And granting unlimited privileged
access gives the user the ability, if not the motive, to make such changes.

Often, the mistakes happen far away from their effects. In one example, users were having
trouble logging in “sometimes.” Their accounts seemed to go in and out of existence
in the space of hours. Some days the problem got worse, some days better. After
almost a week, the senior sysadmin tracked the problem to a transient machine set up 
as an NIS server, brought to the headquarters site by visitors from a field office in 
Israel. Clients find an NIS server by broadcasting on the network, and sometimes they 
found this bogus one instead of the authorized ones run by the local staff. The NIS 
domain name was the same, a fact that caused no trouble back in the field office 
because broadcasts did not reach across the wide-area network. The mistake had its 
roots months in the past and a continent away. Privileged access to that one machine,
and its later transport, was all that was needed.

In another example, a machine in a remote office (this one in Paris) was configured with an
incorrect so-called default route, which normally directs network packets toward the corporate
headquarters and its Internet gateway. But this machine was also configured to run a
dynamic routing process, which advertised this route to all the other computers on the network,
sucking the entire company’s Internet-bound packet traffic into a black hole for about
24 hours.


These are simple examples with fairly easy corrective actions (such as restricting where 
core routers get their default route). But more examples are easily found of “accidents” 
with profoundly obscure mechanisms and highly non-local effects. The more complex 
the system or network, the worse this gets, and the harder it is for even the specialists to 
avoid making disastrous mistakes. 


In these scenarios, not having root privileges means not having a sword hanging over
one’s head. Privilege carries responsibility: privilege one may not need, and responsibility
one can probably do without. If there’s a way to avoid it, take that way.


The Productivity Argument 

Computers are supposed to be tools. Some people collect and enjoy tools for their own 
sake, but a carpenter carries a hammer to pound in nails and build something, not just 
because it looks good on the belt. 

There are two kinds of computer users: those who do computing for its own sake (system
staff and programmers); and those who use the computers to accomplish something
unrelated to the computers they use, something you are paying them to do. The
more time they spend in overhead, getting the tools to work correctly, the less time they 
spend ... not “doing work” exactly, but getting their work done. 

With the complexity of the tools comes specialization. Even in an office where “people 
administer their own desktops” (a codephrase for disaster in this context), an informal 
leader will often emerge, the power user to whom the others turn when they have a 
problem. (Who helped you when you had trouble upgrading your laptop? That’s the one.)
After a while, that person will obtain the secret passwords to the file server or router or 
whatever, and, if you’re lucky, will keep them secret. You now have a sysadmin. 

But what was that person’s “real” job, again? It’s getting worked on, to be sure, but is it 
getting done? Are you happy with that person’s performance in that respect? 

If you’re less lucky, the informal leader will recognize this trap and avoid helping the 
others (and taking the fall at review time), and the rest of them will founder. Where’s 
the manual for A? How did you get B to print? C doesn’t work at all, so I (faxed it / 
wrote it to a floppy / bought a box of pencils / lost the order). 

Specialization requires time and energy. Not everyone has, or should spend, the time to 
read all the manuals and figure out the systems and solve the system problems. But 
someone should. 

One person who is knowledgeable about the systems and can set them up so they run 
well provides leverage for the productivity of all the other people there. Even in 
extremely small organizations, where a whole person can’t be justified, part of one 
should be officially recognized and assigned this role and the resources to carry it out. 
To let this function go unstaffed is to waste a great deal of the power in the systems, to 
waste much of the large investment in technology that most businesses now make. And, 
often much bigger, to waste expensive labor, misdirected into fooling with the computers
while not getting the right value out of them.


In larger shops, where several people are dedicated to systems support, the problem is 
slightly different: There get to be too many specialists, too many cooks spoiling the 
broth. In the small shop, the support person will often have unofficial deputies who 
have privileged access and make limited changes, but communication is easy: “While 
you were on vacation, I rebooted D twice, and had to remove lock file E.”

In larger shops, it becomes ever more important to limit the absolute number of people
with privileged access. Even informal deputies (former sysadmins, senior systems developers)
become first unnecessary, and then undesirable; they have to be deputized formally.
Change-control procedures come to the fore, and documentation is vital. In
these situations, merely knowing about the procedures and how to follow them constitutes
special knowledge that very few part-time helpers can be expected to maintain.
And the folklore that somehow never gets documented underscores this. The answer to
The Question becomes “Thank you for your offer of help, but we’ve got things under
control.” It’s often not entirely true.

The Security Argument 

Another seldom-admitted-to reason behind a request for more privileges is, “Because 
I feel I deserve to be trusted with this.” There is a fallacy behind this, to wit, that 
regular, authorized users aren’t trusted already. Virtually all information security systems
depend absolutely on their users to cooperate with, if not enforce, the rules. All
such systems can be thwarted by authorized users through malice, negligence, or just 
ignorance. 

But of course there are degrees. If one has privileged access, one can bypass user identification.
That is, one can masquerade not only as the “super” user, but as any other
authorized user. One could then do bad things that would be ascribed to the victim. 
People have lost their jobs, even their freedom, from such acts. Those with this access 
are perforce trusted not to abuse it this way. 

Security threats fall into three broad categories: access, disclosure, and modification. 
Unauthorized access is akin to breaking and entering (still a crime, even if the would-be
burglar doesn’t take anything). Denial-of-service attacks are in this group, since they
deny access to authorized users. With disclosure, files (say) don’t have to be destroyed
to do the damage: Someone sends your email about that top-secret unannounced
merger to the Wall Street Journal. Modification (especially if undetected) and its limiting
case, destruction, are another group. How long would you stay in business if your
employees could give themselves a raise by modifying the payroll database without 
authorization? Yet someone has to be able to modify it. That’s the crux of security. 

Most computer systems feature some kind of user identification, at least as part of 
authorization. (What you can do depends on who you are.) Some don’t even do that; 
personal-computer operating systems are only now starting to get these features. Yet, 
with the concept of the all-powerful privileged user who can bypass everything, user 
identification isn’t perfectly reliable. It is meaningful only insofar as you control tightly 
who can bypass it. 

“My people administer their own desktops, and everyone knows the root password to
the fileserver. As long as we all know this, and agree that there is no security, what’s the
problem?” The problem here is that you have discarded this feature of the operating
system, and these hosts cannot be trusted with certain information as a result - information
that people need to store and use in a trusted environment. Do you want to
conduct all your secret communications about that merger only in person? What would
the airfare be like for those last-minute meetings? Okay, you need to trust the telephone.
What about the fax machine? The network printer? The computer on your
desk? Can you? Are you sure? What if you turn out to be uninformed about an important
risk? What if you learn this from the Wall Street Journal?

The sysadmins aren’t supposed to be aware of the merger yet, but they can read your 
email (if they have the time and inclination to look, which is unlikely). But if there are 
only three of them, for sure, and the fact leaks from your office, the investigators will 
thank you for keeping the number small. Interviewing 40 people in this situation, six of 
whom aren’t even employees, is a lot harder, they will tell you. You can’t drive this 
number to zero, with the technology we’ve got today, but you need to keep it small. 

The Economic Argument 

In fact, all of the previous arguments boil down to this one. Stability and reliability 
contribute to productivity and therefore profits. Security is just the limiting of potential 
losses, tangible and intangible, but virtually all financial at bottom, at least in a commercial
enterprise.

This is the weakest argument against a user demanding privileged access. But it underlies
all the policies, all the procedures. Management should bear it in mind but not beat
people up with it. 

Deputies 

Having said all that, there are some cases where extending privileged access to a few 
people outside the systems group will actually further all these goals. Former sysadmins,
system programmers, and very knowledgeable users will often request such access
when they find that their own productivity is impaired by waiting for systems staff 
(often overworked) to do apparently simple tasks as root. But the manager must decide 
if total productivity, stability, or security will be improved (or at least not damaged), 
rather than just the life of one user. It’s seldom an easy call. 

Sysadmin groups usually develop some kind of processes that help them act as a team 
and avoid mistakes. The simplest is change control - to be able to roll back erroneous 
edits to system files and to track who did what. Further development brings formal 
communication and work tracking (mailing lists, ticket systems), and later, formal project
planning, scheduling, and review; still later, metrics of the work processes themselves.

When these processes exist, helpers must be deputized by being brought into the loop 
and made to follow the standard procedures. Some effort is involved in training them 
for this, which can be the source of sysadmin resistance to deputizing someone. 
Sometimes, the request to become an official deputy sysadmin only serves to point out 
a glaring lack of such processes. Managers should recognize this situation, take steps to
create (or just write down) the processes, and then see that the prospective deputy is
properly trained.

If helpers are not formally deputized, but just given the passwords without a second 
thought, they may end up being a source of grief for the sysadmins and the other users, 
and access may have to be revoked: clearly a failure of management. But if this is done 
right, a good deputy will be recognized as an ally to the sysadmins, and a resource to all 
the users. 
the webmaster

Password-Protecting Areas of Your Web Site 

Recently I was talking with a client about restructuring their Web site to make
quite a bit of the material off-limits to the random public, areas with sensitive
information that only their partners would be able to explore. The obvious solution
was to employ a password-protection scheme, but there are a number of
ways to approach this, and I thought it’d be interesting to share my thinking
and results with the ;login: crowd.

Password Protection, Take One 

The first and perhaps most obvious way to do this is to replace a page of information 
with a login FORM on a Web page, then feed what the user entered to a CGI script that 
compares the password against the official password for that area of the site. 

The HTML for the form might look like: 

<FORM ACTION=login.cgi METHOD=post>
Enter Site Password:
<INPUT TYPE=password NAME=pw SIZE=20>
<INPUT TYPE=submit VALUE="log in">
</FORM>

Then, if we’re going to utilize a Perl solution and have something like the simple cgi-lib.pl
library from Steven Brenner (you can get a copy for yourself from your local
CPAN archive), the script underlying this is as simple as:

#!/usr/bin/perl 

push (@INC, '/cgi-bin'); 

require ('cgi-lib.pl'); 

$official = "unix"; # official area password 

&ReadParse; # read and parse arguments 

$pass="$in{'pw'}"; 

if ( $pass eq $official ) { 

print "Location: logged-in.html\n\n"; 

} 

else { 

print "Content-type: text/html\n\n"; 
print "Password Failed."; 

} 

exit 0; 

Enter the correct password (“unix”, as defined in the fourth line of the Perl script) and 
you’re in: The next page you’ll see is “logged-in.html”. 

This first version works, but it’s pretty darn rudimentary and has some drawbacks, not 
the least of which is that everyone has the same password. Obviously, without account 
names to go with it, this is security only in the most minimal of senses. 

There’s another problem here too, one that’s a bit more subtle: Once you’ve logged in,
the URLs you’re seeing in your Web browser are post-login URLs. If you were to send,
say, the real-life URL of the page we’re talking about here -
<http://www.intuitive.com/CGI/password/logged-in.html> - to your friends, they
wouldn’t even need to worry about the password because they’d have effectively
skipped right past it.

You’d be surprised how many sites use this kind of mechanism for password protection.
One password for everyone just isn’t very good. Having a password facade can be
good for some cases, but isn’t very secure.
A Second Try

Before we leave this area, however, an example of where this kind of login might be
useful is when you have individual accounts and want to display different information
based on the type of account. In this case, the wrinkle is that you need to ask for both
an account and password pair on the FORM (a trivial change), extract both in the CGI
script, and then compare them against a file of defined account/password pairs. Let's
create a file “pwfile” that contains two lines

taylor:unix:partner 
guest:guest:guest 



that define the account name, password, and level of access granted. 


The second version of the Perl login script is a bit more complex. First, the subroutine 
that does all the work, reading and parsing the account file: 

sub Matches
{
    local ($given_name, $given_pass) = @_;

    open(PASSWORDS,"pwfile") or die "can't open pwfile";

    while ($line = <PASSWORDS>) {
        chomp($line);

        ($name, $pass, $accesslevel) = split(":", $line);

        if ( $name eq $given_name) {
            if ( $pass eq $given_pass) {    # success!
                return $accesslevel;
            } else {
                return nil;                 # wrong pw
            }
        }
    }

    close(PASSWORDS);

    return nil;                             # no match on acct
}


Here’s where Perl is a winner: The split() routine automatically breaks up the line of
information at the “:” separator, returning each of the three values into its own
mnemonic variable. Then the conditional tests are easy to code.
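
For readers who think more readily in Java than in Perl, the same split-and-compare logic can be sketched as follows. This is only an illustration under stated assumptions: the class name AccountFile and the method matches are hypothetical, the records are passed in as an array rather than read from “pwfile”, and null stands in for the Perl script's nil.

```java
// Illustrative Java analogue of the Perl Matches routine.
// AccountFile and matches are hypothetical names, not part of the original scripts.
class AccountFile {

    /**
     * Returns the access level for a name/password pair, or null when the
     * account is unknown or the password is wrong.
     */
    static String matches(String[] lines, String givenName, String givenPass) {
        for (String line : lines) {
            // Break "name:password:level" apart at the ':' separator.
            String[] fields = line.split(":");
            if (fields.length != 3) {
                continue;                   // skip malformed records
            }
            if (fields[0].equals(givenName)) {
                // Account found: the password decides the outcome.
                return fields[1].equals(givenPass) ? fields[2] : null;
            }
        }
        return null;                        // no match on account
    }

    public static void main(String[] args) {
        String[] pwfile = { "taylor:unix:partner", "guest:guest:guest" };
        System.out.println(matches(pwfile, "taylor", "unix"));  // prints "partner"
        System.out.println(matches(pwfile, "taylor", "oops"));  // prints "null"
    }
}
```

As in the Perl version, the search stops at the first record whose account name matches, whether or not the password is right.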

The main program needs to be modified to take advantage of the level of access that we 
can now grant: 

#!/usr/bin/perl 

push (@INC, '/cgi-bin'); 

require ('cgi-lib.pl'); 

&ReadParse; # read and parse arguments 

$name="$in{'name'}"; 

$pass="$in{'pw'}"; 

if (($access = &Matches($name, $pass)) ne nil) { 
print "Location: $access.html\n\n"; 

} 

else { 

print "Content-type: text/html\n\n"; 
print "Account/Password pair failed.\n"; 

} 

exit 0; 

Do you see what’s happening here? If user “taylor” logs in successfully, he’ll be dropped 
onto the Web page “partner.html,” whereas if the guest user logs in, she’ll start out on 
the “guest.html” page. 


April 1999 ;login: 






This is a cool way to control access to a site and simultaneously offer multiple levels of
access. It still suffers from some of the limitations of the earlier solution, of course. If
you see “guest.html,” you might well guess “partner.html” was another possible Web
page, and poof, you’re in!

A Third Solution: Let the Server Do the Work

Yet another solution, one that offers more security, is to let the Apache Web Server do
the work through the .htpasswd facility. In essence, you create a simple password file
containing names and encrypted passwords, then enable your Web server to look for
the file. Once set up, any access to any files within the protected folder must automatically
be validated by forcing the user to enter a login/password pair.

To create the password file, I use a simple Perl script that I cobbled together called
makepasswd:

#!/usr/bin/perl

print "\nMake htpasswd Account Entry...\n\n";

print "User name : ";
chomp($user = <STDIN>);

print "Password : ";
chomp($passwd = <STDIN>);

srand($$|time);                 # seed the random number generator

@saltchars=(a..z,A..Z,0..9);    # characters used to build the two-character salt

$salt=$saltchars[int(rand($#saltchars+1))];
$salt.=$saltchars[int(rand($#saltchars+1))];

$passwdcrypt = crypt($passwd,$salt);

print "\nAdd the following to the htpasswd file:\n\n";
print "\t$user:$passwdcrypt\n\n";
exit 0;

As you can see, it does the work of encrypting the password and then displays exactly
the information you’ll need to add to the new .htpasswd file. A file duplicating the two
accounts shown above would look like this:

taylor:z6K5hUINhVrIA
guest:SXUISuZuiDOOs

With the passwords encrypted, it’s a lot harder for hackers to reverse-engineer and
sneak in if they manage to snag a copy of this information!

The only other step is to change the httpd.conf file so that the server knows to look
for the password file in the directory. This is done by adding the <Location> block
below to the file:

<VirtualHost www.intuitive.com>

ServerAdmin webmaster@intuitive.com
DocumentRoot /web
ServerName www.intuitive.com

<Location /CGI/password/private>

AuthName /web/CGI/password/private
AuthType basic

AuthUserFile /web/CGI/password/private/.htpasswd
Require valid-user
Allow From All
</Location>

</VirtualHost>

Now we’re rocking! Any access to any of the information in the specified folder
(AuthName) requires the user to log in to the server correctly, with the dialog box (as
shown in Figure 1) popped up and the account/password information compared
against the contents of the AuthUserFile as shown above.

This is very cool and professional looking, being able to hide your information behind
this kind of password protection. The downside is that you don’t get much control over
the presentation and appearance of the box, whereas in the previous approaches you
could build a login page that looked very consistent with the rest of the site’s appearance.

Merging These Together 

Unfortunately, there’s no easy way to include additional fields in the htpasswd file data
(which would be ideal), so instead you’re stuck having either to give out different
password-protected URLs for different classes of users (for example,
yourhost.com/partners/ and yourhost.com/guest/) or try a hybrid solution.


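Figure 1: The browser's login dialog (“Enter Your Name And Password”), prompting
“Enter username for /web/intuitive.com/CGI/password/private at www.intuitive.com:”
with Name and Password fields and Cancel and OK buttons.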


Let’s talk about the latter for a sec. It turns out that once you’ve logged in to a Web
server with the .htpasswd-prompted solution, for the duration of that session you
now have an additional environment variable that you’re carrying around with you:
REMOTE_USER. It’ll contain the name half of the name/password information required
to log in.

With that in your toolbox, you could then have an index.cgi script, for example, that 
looks up the user in a second access-level file (keyed on the REMOTE_USER information), 
then presents a page based on that information. It’s a simple subset of what we’ve 
already seen; we don’t even need to worry about any CGI argument parsing. 

The first part is the replacement for the match routine: 

sub AccessLevel
{
    local ($given_name) = @_;

    open(PASSWORDS,"../pwfile") or die "can't open pwfile";

    while ($line = <PASSWORDS>) {
        chomp($line);

        ($name, $pass, $accesslevel) = split(":", $line);

        if ( $name eq $given_name) {
            return $accesslevel;
        }
    }

    close(PASSWORDS);

    return "guest";    # default access
}

If there isn’t a match in the file, it returns “guest” as the access level. Notice that I’m
using the same file from the previous examples; it’s living one level up on the filesystem
(../pwfile) but otherwise it’s as you’ve already seen.
Finally, here's the simple snippet that's the heart of the switch CGI script:

$name=$ENV{"REMOTE_USER"}; 

$access=&AccessLevel($name); 
print "Location: $access.html\n\n"; 

The variable REMOTE_USER contains the login name of the person who successfully
signed in to the restricted area. If I just signed in as “taylor” (with the password “unix”)
then REMOTE_USER would be set to “taylor” automatically.
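
The lookup-with-a-default idea is not tied to Perl. As a rough sketch in Java, assuming a CGI-style environment in which the server has set REMOTE_USER, it might look like the following. The class AccessSwitch, its method names, and the in-memory map (standing in for the account file) are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Java rendering of the access-level switch; names are illustrative.
class AccessSwitch {

    /** Returns the user's access level, falling back to "guest". */
    static String accessLevel(Map<String, String> levels, String remoteUser) {
        if (remoteUser == null) {
            return "guest";                 // variable not set: treat as a guest
        }
        return levels.getOrDefault(remoteUser, "guest");
    }

    /** Builds the CGI redirect header for the page that matches a level. */
    static String redirectHeader(String level) {
        return "Location: " + level + ".html\n\n";
    }

    public static void main(String[] args) {
        // Stand-in for the second access-level file keyed on the login name.
        Map<String, String> levels = new HashMap<>();
        levels.put("taylor", "partner");
        levels.put("guest", "guest");

        // Assumption: the Web server exports REMOTE_USER to the program's environment.
        String user = System.getenv("REMOTE_USER");
        System.out.print(redirectHeader(accessLevel(levels, user)));
    }
}
```

Because the server has already checked the password, the program only has to map a name to a page; no form parsing is involved.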

Summary 

There's no perfect, graceful solution to password-protecting an area of a Web site with 
complete control, but this does give you a good idea of the different types of solutions 
and their trade-offs. 

You can try out all these different solutions online and experience them for yourself: 
<http://www.intuitive.com/CGI/password/>. 
In a previous column we looked at performance issues with Java I/O. Another 
aspect of I/O that needs to be mentioned is the cost of data formatting. 

Consider first a couple of C examples, ones that output lines of the form: 

The maximum weight is 100 lbs. 

The first example simply prints this string repetitively: 

#include <stdio.h>

int main()
{
    const long N = 100000L;
    long i;

    for (i = 1; i <= N; i++)
        printf("The maximum weight is 100 lbs.\n");

    return 0;
}

The second uses printf() to format the weight and units values: 

#include <stdio.h>

int main()
{
    const long N = 100000L;
    long i;

    int w = 100;
    char* u = "lbs.";

    for (i = 1; i <= N; i++)
        printf("The maximum weight is %d %s\n", w, u);

    return 0;
}

The second program runs around 25 percent slower than the first because of the
formatting overhead. Formatting is clearly a useful feature, but it’s worth knowing
what the costs are.


With Java, a similar issue arises. To determine how expensive formatting is, we can 
write a series of programs. The first prints a string over and over: 

public class format1 {
    public static void main(String args[])
    {
        final int N = 25000;

        for (int i = 1; i <= N; i++) {
            String s = "The maximum weight is 100 lbs.\n";
            System.out.print(s);
        }
    }
}


The second uses the string concatenation operator (+) to format values of variables: 

public class format2 {
    public static void main(String args[])
    {
        final int N = 25000;

        int w = 100;
        String u = "lbs.";

        for (int i = 1; i <= N; i++) {
            String s = "The maximum weight is " + w +
                " " + u + "\n";
            System.out.print(s);
        }
    }
}

This code is slower than the first example because of the extra costs of converting
integer values like “w” to strings, and because of costs in concatenating strings.
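
To make those two costs visible, here is a rough sketch that spells out by hand the kind of work the + operator implies: an integer-to-text conversion plus a series of appends into a buffer. Java compilers of this period generally generated StringBuffer calls for +; the sketch uses the later StringBuilder class, and the class and method names (ConcatCost, withPlus, withBuilder) are hypothetical.

```java
// Hypothetical illustration of what string concatenation involves.
class ConcatCost {

    /** The form used in format2: the compiler expands the + operators. */
    static String withPlus(int w, String u) {
        return "The maximum weight is " + w + " " + u + "\n";
    }

    /** The same result built step by step with an explicit buffer. */
    static String withBuilder(int w, String u) {
        StringBuilder sb = new StringBuilder();
        sb.append("The maximum weight is ");
        sb.append(w);                   // the int is converted to its decimal text here
        sb.append(' ');
        sb.append(u);
        sb.append('\n');
        return sb.toString();           // one more copy into the final String
    }

    public static void main(String[] args) {
        System.out.print(withPlus(100, "lbs."));
        System.out.print(withBuilder(100, "lbs."));
    }
}
```

Both methods produce the same text; the second simply shows where the conversion and copying happen on every pass through the loop.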

The third example uses the java.text.MessageFormat facility:

import java.text.*;

public class format3 {
    public static void main(String args[])
    {
        final int N = 25000;

        MessageFormat f = new
            MessageFormat("The maximum weight is {0} {1}\n");

        int w = 100;
        String u = "lbs.";

        Object vals[] = new Object[2];
        vals[0] = new Integer(w);
        vals[1] = u;

        for (int i = 1; i <= N; i++) {
            String s = f.format(vals);
            System.out.print(s);
        }
    }
}

The idea here is that a format object of type MessageFormat is created and then used.
Placeholders like “{0}” in the format are replaced with object values passed to the
format() method. This approach is desirable if you're trying to write Java applications
that work in international contexts; the message format can be read from a resource
bundle (a group of resources that is keyed off of a locale) and then applied.
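
A small sketch may show why this matters for international use: because the placeholders are numbered, a translated pattern can reword the sentence or reorder the values without any change to the calling code. The class LocalizedMessage, the method render, and the sample patterns below are made up for illustration; a real application would fetch the pattern from a resource bundle rather than hard-code it.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper; patterns would normally come from a resource bundle.
class LocalizedMessage {

    /** Applies a locale-specific pattern to the same pair of values. */
    static String render(String pattern, Locale locale, int weight, String unit) {
        MessageFormat f = new MessageFormat(pattern, locale);
        return f.format(new Object[] { Integer.valueOf(weight), unit });
    }

    public static void main(String[] args) {
        // Same arguments, different wording and placeholder order.
        System.out.println(render("The maximum weight is {0} {1}", Locale.US, 100, "lbs."));
        System.out.println(render("Unit {1}, limit {0}", Locale.US, 100, "lbs."));
        System.out.println(render("Poids maximal : {0} {1}", Locale.FRANCE, 45, "kg"));
    }
}
```

The first call yields “The maximum weight is 100 lbs.” and the second “Unit lbs., limit 100”, from identical arguments.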

The final example is similar to the previous one, but it creates a message format on the 
fly: 

import java.text.*;

public class format4 {
    public static void main(String args[])
    {
        final int N = 25000;

        String f = "The maximum weight is {0} {1}\n";

        int w = 100;
        String u = "lbs.";

        Object vals[] = new Object[2];
        vals[0] = new Integer(w);
        vals[1] = u;

        for (int i = 1; i <= N; i++) {
            String s = MessageFormat.format(f, vals);
            System.out.print(s);
        }
    }
}

In the previous example the message format was created once and then applied
repetitively, allowing for some optimizations to be done. In this example, the format
is passed in as a raw string each time. This approach is simpler but slower.

The programs produce identical output. The running times are:

Program    Time (seconds)
format1    1.3
format2    1.9
format3    5.5
format4    8.0

If you’re trying to tune I/O performance in a Java application, the area of data formatting
may be worth looking at.
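
If you want to repeat this kind of comparison on your own machine, a harness along the following lines is one way to start. It is a sketch under several assumptions: the names (FormatTiming, concat, reusedFormat, freshFormat, millis) are hypothetical, the text is collected in memory rather than printed so that console speed does not dominate, and the timings it reports will not match the figures above.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.function.Supplier;

// Hypothetical timing harness; results vary with the JVM and the machine.
class FormatTiming {
    static final int N = 25000;
    static final String PATTERN = "The maximum weight is {0} {1}\n";

    /** Formatting with the + operator, as in format2. */
    static String concat(int w, String u) {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= N; i++) {
            out.append("The maximum weight is " + w + " " + u + "\n");
        }
        return out.toString();
    }

    /** One MessageFormat object created up front and reused, as in format3. */
    static String reusedFormat(int w, String u) {
        MessageFormat f = new MessageFormat(PATTERN, Locale.US);
        Object[] vals = { Integer.valueOf(w), u };
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= N; i++) {
            out.append(f.format(vals));
        }
        return out.toString();
    }

    /** A new MessageFormat parsed from the raw pattern on every pass, in the spirit of format4. */
    static String freshFormat(int w, String u) {
        Object[] vals = { Integer.valueOf(w), u };
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= N; i++) {
            out.append(new MessageFormat(PATTERN, Locale.US).format(vals));
        }
        return out.toString();
    }

    /** Runs a task once and returns the elapsed wall-clock time in milliseconds. */
    static long millis(Supplier<String> task) {
        long start = System.nanoTime();
        task.get();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("concat:       " + millis(() -> concat(100, "lbs.")) + " ms");
        System.out.println("reusedFormat: " + millis(() -> reusedFormat(100, "lbs.")) + " ms");
        System.out.println("freshFormat:  " + millis(() -> freshFormat(100, "lbs.")) + " ms");
    }
}
```

A single run is only a rough indication; repeating each measurement several times and discarding the first results gives the JVM a chance to warm up.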
using Java

Applets and Animation 


A popular use of Java since its inception has been the creation of applets.
Applets are the “other” way of running Java programs; they do not have a main
method because the browser inside which they run has one instead. Applets are
also special in the sense that they are “panels,” whereas Java programs that
are not applets do not extend java.applet.Applet.

This article focuses on one of the more interesting and popular aspects of writing
applets - animation. This typically means taking an object (text and/or graphics) and
moving it on the screen. We will write a Java program that displays a “banner” in a
browser. The code will include some aspects of programming with the AWT, and I’ll
provide pointers on how to write applets. This applet also demonstrates the use of
“threads” to do animation.
The AWT and the JDK 1.1 Event Model

Let’s start with some fundamentals about applets. (This is a vast topic, so I won’t treat
it very rigorously here.)

The graphics model is one of the key elements of the Abstract Windowing Toolkit 
(AWT), and applets are essentially panels that are part of the AWT. Applets are often 
used in interactive applications that run inside browsers, and so they rely on a powerful 
event model in the JDK to support the interactive capability. The event model is also 
basic to building GUIs (another potential use of applets). 

The event model has changed considerably since JDK 1.0. Although I know of no applications
currently running under JDK 1.0, they may still exist. The JDK 1.0 event model
was a “containment” model in which the events passed through the entire component
hierarchy, and the programmer could control which components handled the event.
The JDK 1.1 event model is known as a “delegation” model. In this model, components
act as event sources, and event listeners are registered with the components whose
events they want to receive. The listeners then handle the events generated by those
components. We will not deal with the containment model from JDK 1.0 because the
releases from JDK 1.1 onward use the delegation model.
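
The shape of the delegation model can be sketched without any GUI at all, using only the java.util.EventObject and java.util.EventListener base types. Everything named below (ClickEvent, ClickListener, ClickSource, DelegationDemo) is hypothetical and exists only for illustration; the AWT's own event and listener pairs, such as ActionEvent and ActionListener, follow the same pattern.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;

// A made-up event type; the source object travels with the event.
class ClickEvent extends EventObject {
    ClickEvent(Object source) {
        super(source);
    }
}

// A made-up listener interface with a single callback.
interface ClickListener extends EventListener {
    void clicked(ClickEvent e);
}

// A made-up event source that keeps a list of registered listeners.
class ClickSource {
    private final List<ClickListener> listeners = new ArrayList<>();

    /** Registration: the listener asks to be told about this source's events. */
    void addClickListener(ClickListener l) {
        listeners.add(l);
    }

    /** Delivery: the source hands the event to every registered listener. */
    void fireClick() {
        ClickEvent e = new ClickEvent(this);
        for (ClickListener l : listeners) {
            l.clicked(e);
        }
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        ClickSource button = new ClickSource();
        button.addClickListener(e -> System.out.println("click received"));
        button.fireClick();
    }
}
```

Only objects that registered are notified, which is the practical difference from the containment model, where every event travelled through the component hierarchy.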

Applets 

Applets are “panels”: The applet window has to be capable of containing other components
such as text areas, buttons, and lists. The java.awt.Panel is the simplest class
that can do this and be a top-level window.

One useful way to look at an applet is that it is defined by its context, such as a Web
browser or appletviewer. Managing an applet means that the user must override methods
such as when to start and what to do when another Web page is visited. The user
must override these methods because java.applet.Applet defines them with empty
bodies. These are the only methods that the user is permitted to override. However, the
user is also permitted to override those that Panel permits.

Example 

This applet is an example of the use of a thread to do animation. The animation consists
of moving some text across a background. This will also serve to introduce some
new elements of the AWT, which I’ll discuss as we come across them.

Let’s start with the animation thread: 

import java.awt.*; 
import java.applet.*; 

Import all relevant packages. 

public class Animation extends Applet

Animation is a subclass of Applet and so is an applet itself.

private Banner b;

Banner is the class that actually does the animation. We’ll study that after we have 
looked at an applet that uses it. 

private Thread animationThread; 

The thread that does the animation. 

public void init() 

The init method that all applets must define. 

resize(Integer.parseInt(getParameter("width")),

Integer.parseInt(getParameter("height")));

Read the parameters supplied by the HTML file and set the size of the applet.

b = new Banner("I Love Austin"); 

Create an instance of the banner class. The argument is an instance of String. This is 
the string that will move across the screen. 

add(b); 

This is a new method. We will look at it in more detail later, but the effect of this is to
tell the AWT system that b is to be contained within the applet. Remember that the
class java.applet.Applet is a subclass of Panel, and panels can contain other AWT
components.

public void start() 

The start method. 

if (animationThread == null) 

{ 

animationThread = new Thread(b); 
animationThread.start(); 

} 

If no thread has been created, create it with an instance of Runnable as target. b is both
an AWT component and runnable.

else if (animationThread.isAlive()) 
animationThread.resume(); 

If it has been created and is runnable, then continue it from where it was, not from the 
start. 

public void stop() 

The stop method. 

if (animationThread != null && animationThread.isAlive())
animationThread = null;

If the thread is runnable, then set it to null. The Java virtual machine will eventually 
notice that the thread created has no references to it and will flush it out of the system. 
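
One caveat, offered as a general observation about Java threads rather than as part of the original applet: a thread whose run method is still looping keeps running even after the last variable referring to it has been cleared, so dropping the reference does not by itself end the animation. A common alternative is to have the loop check a flag that the stop method clears. The sketch below shows that idea in isolation; the class StoppableAnimator and its methods are hypothetical and no drawing is done.

```java
// Hypothetical sketch of a run loop that ends when asked to.
class StoppableAnimator implements Runnable {

    // volatile so that a change made by one thread is seen by the other
    private volatile boolean running = true;

    public void run() {
        while (running) {
            // one animation step would be drawn here
            try {
                Thread.sleep(10L);
            } catch (InterruptedException ex) {
                return;                 // treat an interrupt as a request to stop
            }
        }
    }

    /** Called from another thread, for example from an applet's stop method. */
    void requestStop() {
        running = false;
    }

    /** Starts a thread, asks it to stop, and reports whether it has ended. */
    static boolean stopsCleanly() {
        StoppableAnimator a = new StoppableAnimator();
        Thread t = new Thread(a);
        t.start();
        a.requestStop();
        try {
            t.join(2000L);              // wait up to two seconds for run() to return
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + stopsCleanly());
    }
}
```

Once run returns, the thread is finished and both it and its target can be reclaimed in the usual way.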

public void paint(Graphics g) 

{ 

super.paint(g); 

} 

Now let’s introduce the Banner class: 

class Banner extends Component implements Runnable 
Notice that Banner is a subclass of Component (which is an AWT component) because 
it extends Component, and it is runnable because it implements the interface Runnable. 
As you will remember, this means that instances of the class Banner can be supplied as 
targets when threads are created. This is exactly what we did in the applet previously. 

Now for the various fields of the class Banner. 

private String bannerString; 

This is the banner that is dragged across the box. 

private int boxw, boxh; 

The size of the box across which the banner is dragged. 

private Color fgColor;

The color of the font used for drawing the banner.

private Color bgColor;

The background color of the box on which the banner string will be drawn.

private Font bannerFont = new Font("Helvetica", Font.PLAIN, 36);

The font used for the banner string. We made it large ... just like in real life.

private int[] X, Y;

An array of positions that are the starting points for drawing the banner string. The
banner is first drawn at the coordinate (X[0], Y[0]), then it is moved by drawing at
location (X[1], Y[1]), then at (X[2], Y[2]), and so on.

private int bannerx, bannery;

The location at which the banner is currently drawn. This will move through the members
of the arrays X and Y.

Now that we have the Banner class we need to do a few more things. We must provide
a body to the run method. Remember that run is part of the interface Runnable, and
any class implementing Runnable must also provide a body for the run method. This is
the method that will be called when an instance of the class Banner is passed as the
target of the thread:

public void run() 

The run method has two parts. The first part does some initialization. The second part 
does the actual animation. I’ll discuss the local variables as we come across them. 

FontMetrics fm = getFontMetrics(bannerFont); 

The class FontMetrics contains information about the font family. We need that to 
properly position the banner within this box. 

Dimension d = getSize(); 

This is the size of the component: its width and height. 

boxw = d.width; 
boxh = d.height; 

boxw and boxh are fields of this class. 

delta = 1; 

The amount (in pixels) by which the banner is moved to the left at each step. We 
choose to move the string by one pixel in each new step. This is a matter of trial and 
error and experience. A value that is too large will make it jerky. A very small value will 
make it computationally expensive. This example isn't very expensive, so we choose a 
small value. 
sw = fm.stringWidth(bannerString); 

Compute the width of this string in this font. 

n = sw/delta; 

The number of steps it takes to move the string across the box - this is just the distance
to be covered (here, the width of the string) divided by the size of each step.

X = new int[n]; 

Y = new int[n]; 

Create arrays for holding the location of the first character of the banner. Each pair of 
elements (X[i], Y[i]) specifies a location of the first character of the banner string. 

X[0] = sw + (boxw - sw)/2; 

Y[0] = (boxh + fm.getAscent())/2; 

Set the zeroth element. 

for (int i = 1; i < n; i++) 

{ 

X[i] = X[i - 1] - delta; 

Y[i] = Y[i - 1]; 

} 

The Y coordinate doesn’t change, but the X coordinate is moved over to the left. 
Remember that we are just initializing things here; no animation is being done. We are, 
however, inside the thread. It is possible to move this initialization phase outside of run 
to the time when the banner is first created, but we chose not to do it that way. 
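
Pulled out of the applet, the position arithmetic is small enough to check on its own. The following sketch computes only the X coordinates, using the same formulas as the fragments above; the class BannerPath and the method xPositions are hypothetical names, and the sample sizes in main are arbitrary.

```java
// Hypothetical stand-alone version of the X-coordinate initialization.
class BannerPath {

    /**
     * Returns the successive X coordinates of the banner for a box of width
     * boxw, a string of width sw, and a step of delta pixels.
     */
    static int[] xPositions(int boxw, int sw, int delta) {
        int n = sw / delta;                     // number of steps
        int[] x = new int[n];
        if (n == 0) {
            return x;                           // nothing to animate
        }
        x[0] = sw + (boxw - sw) / 2;            // starting point
        for (int i = 1; i < n; i++) {
            x[i] = x[i - 1] - delta;            // each step moves left by delta
        }
        return x;
    }

    public static void main(String[] args) {
        int[] x = xPositions(400, 100, 1);
        System.out.println(x.length + " steps, from " + x[0] + " to " + x[x.length - 1]);
    }
}
```

With a 400-pixel box, a 100-pixel string, and a one-pixel step, the banner starts at x = 250 and ends at x = 151 after 100 steps.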

while (true) 

Now we start the actual animation. This loop never ends, so we see the banner go 
round and round for ever. The loop just consists of setting the current location to a 
new position and then redrawing the screen - nothing complicated. 

for (int i = 0; i < X.length; i++) 

For each element of the array (could be X or Y here; they are both the same length) 

bannerx = X[i];
bannery = Y[i];

Set the current location. 

repaint(); 

Repaint this component. repaint is a very important method. It has the effect of clearing
the background and redrawing whatever needs to be redrawn. That is, it clears the
background and calls the paint method. This is just what we need. When we change
the position of the banner string, we want to rub it out at its previous location and
draw it at the new location. This is just what repaint does.

try 

{ 

Thread.sleep(100L); 

} 

catch(InterruptedException ex) {} 

Just sleep for a while between steps in the animation - to slow things down a little. 

try 

{ 

Thread.sleep(2000L); 

} 

catch(InterruptedException ex) {} 

After going through the entire animation, stop for a while before starting again. 
Finally the all-important paint method must be provided:

public void paint(Graphics g)

Dimension d = getSize();

The size of the box; this should be the same as boxw and boxh that we set previously.
We could use those just as well.

Font f = getFont();
Color c = g.getColor();

Save the values of the current font and current color. This is not really necessary, but I
have put it in just to show typical programming constructs. If you need to change colors
and fonts multiple times, you will have to save old values if you need them later.

g.setColor(bgColor); 

g.fillRect(0, 0, d.width, d.height); 

Set the color and fill this box with that color. Remember that bgColor is black, so this 
will give us the black background we need. 

g.setColor(fgColor); 
g.setFont(bannerFont);

g.drawString(bannerString, bannerx, bannery);

Change the color again for drawing the banner, set the font in which the banner will be 
drawn, and then draw the banner string at the current location. 

g.setFont(f); 
g.setColor(c); 

Strictly unnecessary, but you can reset the old values this way. 

Running Applets 

In order to view this applet we must now create an HTML file that has the name of the 
applet. This is how a browser identifies which Java class contains the applet that is to be 
run. The following is the HTML file for this applet. You can call this file 
animation.html or anything else with the same extension. 

<HTML> 

<HEAD> 

<TITLE>Animation Applet</TITLE> 

</HEAD> 

<BODY> 

<H2>Animation Applet</H2>

<CENTER>

<APPLET CODEBASE="." CODE="Animation.class" name=animation width=400
height=300>

<PARAM NAME=banner VALUE="I Love Austin"></APPLET>

</CENTER> 

<HR> 

</BODY> 

</HTML> 

As you can see, this is a very simple HTML file that will display a banner that says “I 
Love Austin.” (It’s supposed to be a nice place.) 

Conclusion 

I have demonstrated by example the use of applets to do simple animation. In the 
process, we gained some insight into the power of the AWT and the use of threads. The 
Java environment is truly a rich one for writing powerful applications, be they console- 
based or applets. In subsequent articles we will look at inter-applet synchron 

untruths and the truth

This is a journey through the new authentication mechanisms. 

At a time when there is talk of “certainty of access,” of authentication of the 
users of a service, biometrics offers a solution. 

Introduction 

Biometric information - patterns of unique physiological and behavioral traits - can be 
used to authenticate access to given critical resources. Any organization may have 
strategic resources requiring protection. These can range from so-called sensitive data 
(data covered by laws concerning rights to privacy, for example), to mission-critical 
financial or commercial data, to military or legal information. 

In addition to protecting data, an authentication mechanism based on biometrics can 
permit selective (and therefore sufficiently safe) access to given rooms or structures for 
which there is an intrusion risk. 


System Components 

A biometric authentication device is made up of three components: 

■ A database of biometric data. As you would expect, this is a large store of physiological
and/or behavioral data. The stored information is compared with the input given
at the time of access.

■ Input procedures and devices. These are the systems (biometric readers, means for 
carrying the information, etc.) that connect the would-be guest with the validation 
system. 

■ Output and graphical interfaces. This is the front end. It is used to enter and display 
part of the access data and to obtain responses from the system. 

Types of Biometric Data 

We have described the components of a biometric authentication system. Now 
we shall explain the types of physiological and behavioral information that can be 
authenticated. 

Possible physiological data include: 

■ retina prints 

■ fingerprints and palm prints 

■ voice prints 

■ keyboard input measurements 

■ iris recognition 

At the moment, some feel that recognition of the retina is adequate both from the 
point of view of safety and, importantly, from that of the bandwidth required for an 
on-line transaction. It has been calculated that a network of devices for biometric 
authentication will take up about 32 Kb/sec. If a system’s available bandwidth is 128 
Kb/sec, biometric authentication alone would eat up 25 percent of the total. Enough 
bandwidth must be reserved for other services, ranging from electronic mail to video- 
conferencing. 

For fingerprint recognition, Compaq Computer Corporation is offering its Fingerprint
Identification Technology at under one hundred dollars. Guaranteed to be compatible
with Compaq Deskpro, Armada PCs, and Professional Workstations, it is currently in
the testing phase for security applications in the Windows NT environment, and experiments
are being carried out on domain access as a replacement for conventional passwords.
The fingerprint reader is placed near the video terminal and linked to a serial
port. It can be integrated with SmartCard use.

In addition to fingerprint scanning, voice recognition, and, in some military applications,
dynamic measurements of character entry via a keyboard, biometric recognition
can be based on the patterns of the iris. The iris of the eye has a unique and visible
structure which it is not currently possible to duplicate. It has been ascertained that the
human iris can identify an individual as accurately as his DNA. What is more, the iris is
stable throughout an individual’s lifetime.

Iris recognition is considered to be “just short of infallible,” definitely more foolproof
than fingerprint recognition. The pattern of the iris can be compared with the information
contained in an IrisCode database, with over 266 options for each record. The
scanning method is definitely one of the most transparent, since no physical contact
with the scanner is required.

The most crucial operation in iris recognition is the scan for the record in the database.
Strange as it may seem, the best results are achieved with a black-and-white camera.
According to IrisScan, the developer of iris recognition technology, black-and-white
scanning eliminates the possibility of incorrect recognition due to such factors as
narcotic or prescription drug use or colored contact lenses.




Reliability 

What risk of error is there when using biometrics to control access? Many believe
the risk to be infinitesimal. Others are concerned, not so much about possible
counterfeiting of the physiological data of an individual, as about error on the part
of the scanners.

Scanner manufacturers deny the imputation. At the Sicur 98 meeting in Madrid,
Norberto Cartagena, sales manager of Ultra Scan of Miami, Florida, a firm that has
been active in the United States for over ten years, stated that at least for fingerprints,
the scanners currently available are reliable. Cartagena does not deny the need to
optimize some of their features; however, he sees this as part of the normal product
updating roadmap.


Biometrics and the Internet 

An interesting step forward in integration between biometric devices and information
systems linked to the Internet has been made by iNTELiTRAK Technologies Inc. Their
CITADEL GateKeeper has recently received security certification from the ICSA
(formerly NCSA). The objective is to enable authentication of users of the Internet, an
intranet, or an extranet, not by password or other such standard means, but by
voice-pattern recognition.

CITADEL GateKeeper works as follows: 

■ The remote user links up with the authentication service via IP network or by
telephone.

■ Once the link has been established, she follows the authentication instructions. (It is
worth noting that the voice input may be provided through a Sound
Blaster-compatible microphone.)

■ GateKeeper carries out an analysis of the voice, comparing it with its authentication
database, which can also interact with, among other things, X.509v3 digital
certificates. If a match is found, the system permits access to the information structures. If
not, it follows an administrator-defined procedure to report an intrusion attempt.
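
The decision step in the list above can be sketched in a few lines of Python. This is only an illustration of the match-or-report flow, not how GateKeeper itself works: the cosine-similarity comparison, the 0.95 threshold, the feature vectors, and every name in the code are assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass
class AuthResult:
    user: str
    granted: bool
    score: float


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def authenticate(
    user: str,
    sample: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    threshold: float = 0.95,  # assumed value, chosen for illustration only
    on_failure: Callable[[str, float], None] = lambda user, score: None,
) -> AuthResult:
    """Compare a voice sample with the enrolled template for `user`.

    On a mismatch (or an unknown user) the administrator-defined
    `on_failure` hook is called so the attempt can be reported.
    """
    template = enrolled.get(user)
    score = similarity(sample, template) if template is not None else 0.0
    granted = score >= threshold
    if not granted:
        on_failure(user, score)
    return AuthResult(user, granted, score)
```

The threshold is where the error-rate concern shows up in practice: raising it rejects more impostors but also more legitimate users.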

Combining biometrics with conventional authentication methods carried out by, for
example, a firewall or RADIUS server reduces the success rate of sniffers to a
minimum. The system is fairly simple to use and also to integrate. The only concern lies in
the error rate of the biometric recognition method. However, when the scanning and
voice-pattern sampling, as well as the voice recognition, are carried out at different
frequencies, maximum granularity should be attained.

Administration of Biometric Systems 

Security operators recommend that the biometric database be administered by the 
security manager rather than a database administrator and that remuneration be 
directly proportional to the type of strategic resource being protected. This ensures 
both optimal safeguarding of the operator and a sense of responsibility for the project. 

To Hash or Not to Hash? 

Hashing, or calculating a fixed-length numerical value from an input of arbitrary
length, is intended to make it possible to verify the integrity of the hashed data after
transmission via a network. Generally speaking, cryptographic algorithms are used to
generate these values. Hashing functions are currently used by
programs such as PGP. It has been asked recently whether hashing functions should or
could be added to biometric databases. Most experts feel that methods such as iris
recognition are sufficiently safe, in particular when combined with the use of
SmartCards.
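
To make the idea concrete, here is a minimal sketch of an integrity check on a stored biometric template, using Python's standard hashlib module. The article names no algorithm; SHA-256 is picked purely for illustration, and the function names and the sample template bytes are hypothetical.

```python
import hashlib
import hmac


def template_digest(template: bytes) -> str:
    """Return a fixed-length hex digest of a biometric template."""
    return hashlib.sha256(template).hexdigest()


def is_intact(template: bytes, expected_digest: str) -> bool:
    """Check a received template against the digest recorded at enrollment."""
    # compare_digest avoids leaking where the first difference occurs.
    return hmac.compare_digest(template_digest(template), expected_digest)
```

The digest has the same length whatever the size of the template, and changing a single byte of the template changes the digest, which is what makes the check useful after transmission.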

Has the Time Come for Biometrics? 

I recently talked with Cyril G. Reif, Director of Industry Technology, Financial Services 
Industry, at Sun Microsystems. Reif, who manages world-level accounts for Sun,
reported on some comments heard from people in the banking sector. “Although the banking
world does not exclude future use of biometrics in ATMs [Automatic Teller Machines], 
it is somewhat doubtful about this possibility, basically for reasons of lack of flexibility 
of use. They may possibly be used in the future; however, systems based on a Java 
SmartCard and on X509 digital signatures, which are currently the standard, are 
thought to be sufficiently safe.” 

In a recent interview in Foster City, Stephen Schapp, Deputy Chairman, Emerging
Electronic Payments, of Visa International, confirmed that it is possible to have
electronic payment methods interact with biometric authentication devices. Schapp
himself, however, expressed some doubt about the use of authentication based on
fingerprints, at least for ATMs. He felt that the scanning, checking, and authentication
procedures would today require too much of the bank WAN’s bandwidth. On the other
hand, Schapp felt that it would be possible to use retina scanning in the future,
although the applications based on this type of method require optimization.

Pilot implementations of this type have already been started by Visa International in
the framework of the now famous Visa Open Platform project, introducing a suite of
financial services based on new-generation SmartCards. Visa has implemented a
Java-language software layer between the operating system of the card and the applications.
This layer acts as a bonding agent between the components described above, enabling
multiple uses of a single SmartCard in electronic commerce.


In the past two months I’ve been traveling around Europe seeking to increase my 
knowledge of biometric products. I’m torn between two scenarios: digital fingerprint 
applications and iris recognition. Fingerprint-recognition vendors such as Siemens, 
Digital Persona, and Compaq are pushing their products hard, but customers are wary 
of the possibility of scanning errors. Iris-recognition vendors such as IriScan, Sensar,
Olivetti, and WangGlobal offer a very interesting alternative, yet one that threatens 
excessively high costs for ATM implementation. My personal conclusion: we’ll need to 
wait another eight to ten months for a solid biometric system - a reasonable period in 
the IT world. 


source code UNIX 

Embedding Source Code UNIX in the Product 


The Essence is source code. If you have it, destiny is in your own hands,
bringing flexibility and self-directed control. If not, you may fall prey to the whims of
a large corporation and have to plead for bug fixes, enhancements, and device
drivers.

Let’s consider some of the advantages and disadvantages of using Source Code UNIX as
the operating system controlling an embedded product. I’ll examine the issues, the 
choices, and some concrete successes. 

An “embedded product” has a processor (CPU), some memory, and a controlling
program to produce the required features and functionality. Household appliances these
days are often embedded products. A typical refrigerator has sensors to monitor the
environment, including temperature and humidity. Based on these and the
user-settable parameters (user interface), the controlling program gives commands to fans,
motors, heaters, and lights to achieve the desired results.
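
A controlling program of this kind can be very small. The sketch below shows one step of a refrigerator-style control loop in Python, using simple hysteresis so the compressor does not chatter around the setpoint. The setpoint, the one-degree band, and the function names are assumptions for illustration; a real appliance would read sensors and drive hardware rather than take a list of readings.

```python
def control_step(temp_c, setpoint_c, compressor_on, band_c=1.0):
    """Return the new compressor state for one sensor reading.

    Hysteresis: switch on above setpoint + band, off below
    setpoint - band, and otherwise keep the current state.
    """
    if temp_c > setpoint_c + band_c:
        return True
    if temp_c < setpoint_c - band_c:
        return False
    return compressor_on


def run(readings, setpoint_c=4.0):
    """Feed a sequence of temperature readings through the controller."""
    state = False
    states = []
    for temp_c in readings:
        state = control_step(temp_c, setpoint_c, state)
        states.append(state)
    return states
```

For example, `run([4.0, 5.5, 4.5, 3.5, 2.5, 4.0])` switches the compressor on at 5.5 degrees and leaves it on until the reading drops to 2.5.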

Mass-produced embedded products have characteristics and economies of scale that
warrant custom software - possibly cast in stone (or silicon). Usually the software is
relatively simple and unchanging. Minimizing the cost of goods, especially the
hardware, is extremely important. When you are building a million cars, stereos, or
dishwashers, you’ll choose to spend $500,000 on a custom ASIC (application-specific
IC) if it will save a few bucks per unit.
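
The arithmetic behind that choice is worth spelling out. In the sketch below, the $500,000 one-time cost and the million-unit volume come from the text; the saving of $2 per unit is an assumed stand-in for "a few bucks", and the function names are hypothetical.

```python
import math


def break_even_units(one_time_cost, saving_per_unit):
    """Units that must ship before the one-time cost is recovered."""
    return math.ceil(one_time_cost / saving_per_unit)


def net_saving(one_time_cost, saving_per_unit, units):
    """Total saving over a production run, after the one-time cost."""
    return saving_per_unit * units - one_time_cost


# Assumed saving of $2 per unit; the ASIC cost and volume are from the text.
units_to_recover = break_even_units(500_000, 2.0)
saving_at_one_million = net_saving(500_000, 2.0, 1_000_000)
```

Under that assumption the ASIC pays for itself after 250,000 units and saves $1.5 million over a million-unit run. At a volume of a few thousand units, the same calculation goes firmly the other way, which is the situation of the products discussed next.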

The embedded products that this article discusses resemble general-purpose
workstations. Medical-imaging systems, laser printers, gateway routers, and vision systems have
powerful processors, many megabytes of memory, disks, communications channels,
and user interfaces. Often such products require virtual memory and multitasking. For
these kinds of systems, the prices and quantities sold have characteristics different from
more common and abundant appliances. In most cases, a short development time is
more important than a small cost reduction for the hardware. The case for Source
Code UNIX is strong for these products.

Operating System Features

Let’s look at the issues involved when choosing software for controlling an embedded
system. You’ll want to consider how much total development effort you can afford and
how long the development can take. Most organizations are eager to get things working
in the shortest amount of time using a modest engineering team. Leveraging an
existing, appropriate operating system will save costly software-development time. Further,
risks are reduced because you start with part of the project already completed.

A key phrase in the above paragraph is “appropriate operating system.” It wouldn’t 
make sense to use the sophisticated UNIX operating system to control a washing 
machine. You need to look at your requirements and choose an OS that covers almost 
all of the needs. But plan for future features and enhancements, because most
successful products have follow-on versions that place more demands on the software. For
example, maybe the initial version of your product operates in standalone mode. If it is 
eventually going to be connected to other devices, you should think about an operating 
system that has network protocol stacks. Similarly, if “release 1.0” has all processes in 
physical memory, but you can see a requirement to add many more tasks in the future, 
you might want to start with a virtual-memory, paging system. Here are some
questions to ask:

■ Do you need multitasking, priority scheduling, and overlapped CPU-I/O? 

■ Do you need separate, memory-protected address spaces? 

■ Do you need networking capabilities? 

■ Do you need realtime capabilities (hard or soft)? 

All of the above are large undertakings. Do you have the time and resources to build 
them yourself? 

Operating System Choices 

Let’s say that you are the head software engineer responsible for building the product. If
you are in from the beginning, you’ll have some say in the selection of a processor type, 
a bus technology, a memory system, and peripherals. For example, MIPS has the largest 
number of embedded processors in the field - mostly because of its price/performance 
advantage over expensive chips such as the Pentium. But choosing this processor has to
be balanced against the software-development costs. If you develop on Pentium
workstations for a MIPS target, you may need a cross-compiling environment and will
sometimes be faced with problems such as endianness. For modest I/O requirements, a PCI bus will suffice; otherwise, you’ll
need a proprietary design. It’s best to stick with mainstream components unless there 
are compelling reasons to go with specialty items. While considering hardware, look at 
your software choices and balance everything with cost, development effort, and time 
to market. Don’t forget the cost of testing and maintenance. 
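
The endianness problem mentioned above is easy to demonstrate. The same 32-bit value is laid out in opposite byte orders on little-endian and big-endian machines, so any binary record shared between a development host and a target needs an explicit byte order. The sketch below uses Python's standard struct module; the record layout (a 16-bit sensor ID followed by a 32-bit value) is invented for the example.

```python
import struct


def encode_record(sensor_id: int, value: int) -> bytes:
    """Pack a record in big-endian order, independent of the host CPU.

    '>' selects big-endian with no padding; 'H' is an unsigned 16-bit
    field and 'I' an unsigned 32-bit field.
    """
    return struct.pack(">HI", sensor_id, value)


def decode_record(data: bytes):
    """Unpack a record written by encode_record on any host."""
    return struct.unpack(">HI", data)


# The same integer in the two byte orders:
little = struct.pack("<I", 0x01020304)
big = struct.pack(">I", 0x01020304)
```

Fixing the byte order in the file or wire format, rather than relying on the native layout of whichever machine wrote the data, avoids the whole class of problem.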

The main Source Code UNIX systems, Linux and the BSDs, cover the popular
processors: Motorola 680X0, Pentium, SPARC, PowerPC, StrongARM, MIPS. The same is true
for the commercial realtime operating systems. (I’ll use the term RTOS for the
commercial embedded operating systems below. Most have hard realtime scheduling
priorities.) Your choice between Source Code UNIX and an RTOS will boil down to what
features are provided and at what cost. Rarely is source code available for an RTOS at a
modest price. Often, you’ll have up-front costs and per-unit costs to use RTOS binaries
in your product.

The primary reason to consider Source Code UNIX is self-directed control of software
development. You can examine the code to see in detail what is going on and modify it
to achieve your objectives. If you are lucky, there will be an RTOS that closely meets
your requirements - it might not require any changes. But if it lacks features or
performance crucial to your product, you could be in trouble. The vendor might be willing to
make custom changes for you, but that will cost you time and money. Depending on
the size and nature of the modifications, your budget or schedule might be broken.

In contrast, if you have source code, you always have the option of making enhancements
yourself. If the system is misbehaving, you can track down the problem with your own
resources instead of waiting for the vendor to get to it.

A unified environment for development and the embedded system enhances
productivity. Any time you can carry out a large portion of software development and testing on
your workstation, you improve the development cycle. The value of familiar compilers
and tools should not be underestimated. While waiting for your hardware engineers to
build your custom components, you may be able to simulate the equipment on a
general-purpose workstation.


Examples of Embedded-Systems Projects


A Video-Recorder Product 

In late 1995, as head of software in a tiny startup company, I had the job of designing
and implementing the software to run a disk-based, high-end, video-recorder product.
The combined audio/video data rate required storing or retrieving at about
30MB/second. Commodity disks of the day could only sustain a rate of about 5MB/second.
Clearly we had to stripe the data across multiple disks. We also required redundancy
against a single disk failure and achieved it with RAID3. We homed in on the
mainstream technology of SCSI disks, PCI buses, and Intel Pentiums.
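
The numbers in this paragraph lead directly to the design. The following Python sketch works out the minimum number of data disks from the two rates given above and shows the XOR parity idea behind RAID 3 style recovery from a single failed disk. It is an illustration only, with hypothetical function names, and says nothing about how the actual product laid out its stripes.

```python
import math
from functools import reduce


def data_disks_needed(required_mb_s, per_disk_mb_s):
    """Minimum number of data disks to sustain the required rate."""
    return math.ceil(required_mb_s / per_disk_mb_s)


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (the parity computation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the one missing block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Rates from the text: 30MB/second required, 5MB/second per disk.
minimum_data_disks = data_disks_needed(30, 5)
```

Six data disks is the bare minimum at these rates, with no headroom, and the parity disk comes on top of that. Because XOR is its own inverse, the block from any single failed disk is the XOR of the remaining blocks and the parity block.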

The naive hardware guys thought that NT would be the ideal software to run our
product, but they had been brainwashed by a large corporate marketing organization, and
they really didn’t understand the issues. Given that we were going to build our own 
video card and motherboard (containing 20 SCSI buses, 3 PCI buses, Ethernet, and a 
high-bandwidth memory system with its own XOR engine), I saw that we needed 
source code to achieve our ambitious goals. The complexity of the product required a 
multitasking, virtual-memory, demand-paged operating system. We needed network 
protocols, a graphical user interface, and control over process scheduling. I knew we 
would require many enhancements in the SCSI subsystem for error recovery and hot
swappability. Source Code UNIX was a great fit for the problem. Linux was a candidate,
but I ended up choosing FreeBSD because of the better SCSI-driver support and 
because I had experience with BSD UNIX dating back to the late 1970s. 

We leveraged many parts of FreeBSD source-code UNIX - many of which are not
available in an RTOS. We had to get into the kernel to tweak PCI-bridge code, customize the
serial driver, and make a handful of changes to boot-up code. We put much effort into
improving the SCSI driver for our RAID system - the generic improvements were given
back to the UNIX community. Our GUI leveraged X11 and Tcl/Tk. The product can be
remotely upgraded with CVSup (a software package for distributing and updating
source trees from a master CVS repository: see <http://www.freebsd.org/handbook>). Think of
this as an elaborate EEPROM upgrade, such as you would do for a flashable modem.

We came up with a clever file-conversion mechanism that uses lazy evaluation and
leverages the UNIX vnode infrastructure. By accessing virtual files in the
name space, implicit color-space conversions and file wrappers are invoked. See the
code in BSD kernels under sys/miscfs for examples. We served network files to
UNIX, Macs, and PCs using NFS and Samba. We wrote numerous custom drivers for
our hardware (temperature sensors, XOR engine, LED displays, etc.). We had existing
drivers as reference guides, so extending the code to match, for example, the DEC
21143 Ethernet chip was easy. We built the hardware and software from scratch with a
team of 10 in 14 months. I can’t imagine completing such a project in a reasonable
amount of time without source code.

A Laser-Printer Product 

A few of us are currently improving the performance of a medium-priced laser printer.
It needs a full-featured operating system to support tasks like interpreting PostScript
and PCL and rendering images, graphics, and other material. Printers have disks
for spooling, fonts, and paging. Interfaces to the printer include serial and parallel
communications and Ethernet. The printer can receive jobs via FTP or spooling
mechanisms, or just by sending the bits. Status and control are available using Telnet or HTTP
Web interfaces. Printers of the current generation have huge virtual-memory needs. An
8.5 x 11 inch image with vertical resolution of 1200 dpi and horizontal resolution of 2400
dpi has about 256 megapixels. Using a byte for each of the cyan, magenta, yellow, and
black planes (CMYK) requires 1GB to hold an image. Then multiply that by
double-sided, multiple-copy, and n-up printing. Don’t forget that page rasters have to be
retained in case of an error such as a paper jam. I think you can see that printers have
more to their operating systems than it may first appear. We’re using an older Source
Code UNIX-like system now, but plan to move the next-generation MIPS R5000
product to an OpenBSD base.
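
The page-raster figures above are easy to check. The short Python calculation below reproduces them from the page size and resolutions given in the text; the function name and its parameters are made up for the example, and "mega" and "giga" are taken in their binary senses (2^20 and 2^30), which is what makes the quoted figures come out.

```python
def raster_size(width_in, height_in, dpi_x, dpi_y, planes=4, bytes_per_sample=1):
    """Return (pixels, bytes) for one full-page raster."""
    pixels = round(width_in * dpi_x) * round(height_in * dpi_y)
    return pixels, pixels * planes * bytes_per_sample


# 8.5 x 11 inches, 2400 dpi horizontal, 1200 dpi vertical, CMYK at 1 byte per plane.
pixels, nbytes = raster_size(8.5, 11, 2400, 1200)
mega_pixels = pixels / 2**20   # a little under 257
gigabytes = nbytes / 2**30     # just over 1
```

That is 20,400 by 13,200 pixels, or 269,280,000 in all, and 1,077,120,000 bytes for the four planes of a single page side.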

I’d like to point out that one of the remaining inefficiencies of the printer product is 
TCP throughput. This is one of the few pieces for which we do not have source code. 
The associated vendor is not very interested in improving his microcode. 

An Image-Processing System 

In the late 1970s, we were awarded a contract to build an image-processing system for
the Defense Mapping Agency. It would include a flatbed scanner, stereoscopic,
high-resolution displays, and adjunct floating-point processors. The commodity processors
of the time were PDP-11s, and the mainstream languages were assembler and Fortran,
under the DEC RSX11M operating system. We convinced those in power to go out on a
limb by using C with UNIX. We argued that we could deliver a more feature-rich
system sooner if we used a systems high-level language (C) and a Source Code UNIX.
Having source code allowed me to write a set of high-speed file routines (which were
given back to the community) to achieve the required I/O bandwidth. Many of the
same concepts are now mainstream in McKusick’s Fast File System, which is the
standard UNIX File System (UFS).

As a part of the image-processing system, we had a high-speed A/D converter
connected to a PDP-11. The general belief was that even though UNIX had all of the right
auxiliary processing tools for the collected A/D data, we would need a “realtime” kernel to
interface to the device. We had to spend only a couple of days to write a device driver
that sampled the data at interrupt time and buffered it. Ten lines of UNIX code were
changed. No samples were lost, and UNIX happily ran on the same computer.
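
The technique described here, sampling at interrupt time and buffering for later pickup, is usually built around a fixed-size ring buffer. The original driver was of course kernel C code that is not shown in this article; the Python class below is only a model of the data structure, with invented names, in which `put` stands in for the interrupt handler and `drain` for the process that reads the device.

```python
class SampleRing:
    """Fixed-size ring buffer: one producer (the interrupt), one consumer."""

    def __init__(self, capacity):
        self._buf = [0] * capacity
        self._capacity = capacity
        self._head = 0      # index of the next slot to write
        self._tail = 0      # index of the next slot to read
        self._count = 0
        self.dropped = 0    # samples lost because the buffer was full

    def put(self, sample):
        """Store one sample; called at 'interrupt time'. Never blocks."""
        if self._count == self._capacity:
            self.dropped += 1
            return False
        self._buf[self._head] = sample
        self._head = (self._head + 1) % self._capacity
        self._count += 1
        return True

    def drain(self):
        """Return all buffered samples in arrival order and empty the ring."""
        out = []
        while self._count:
            out.append(self._buf[self._tail])
            self._tail = (self._tail + 1) % self._capacity
            self._count -= 1
        return out
```

As long as the reader drains the ring before it fills, `dropped` stays at zero, which corresponds to the "no samples were lost" result reported above. Sizing the buffer for the worst-case scheduling delay is what makes that possible without a realtime kernel.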

Other Commercial Products Using Embedded Source Code UNIX 

Cobalt Networks’ Qube 2 is a low-cost server appliance that provides Internet
connectivity, email, Web publishing, and other Web and network file services. Running Linux
on a 250MHz MIPS, it allows developers to add Web-server-based applications. See
<http://www.cobaltmicro.com>.

The Whistle Interjet, using FreeBSD, provides email, Web access, and Web publishing 
for a small office. Whistle has been an active partner with the Open Source community, 
using and contributing to the Open Source projects. See <http://www.whistle.com>. 

When Not to Use UNIX 

Some RTOS vendors have extremely good products. Their code might fit your problem 
like a glove. If your problem matches the features of an existing RTOS but doesn’t fit 
the UNIX model, your choice is easy. Remember, many of the commercial vendors have 
a long history of serving particular areas in the market. Their mature product could be 
what you should use. 

At first, I was going to recommend against UNIX for the small, nonpaged systems, but
a Source Code UNIX can be stripped down to fit many situations. PicoBSD
demonstrates that there are advantages to using UNIX even in tiny environments; it is a
one-floppy version of FreeBSD, which in its different variations allows you to have secure
dialup access, a small diskless router, or even a dial-in server. And all this on only one
standard 1.44MB floppy. It runs on a minimum 386SX CPU with 8MB of RAM (no
HDD required!). See <http://www.freebsd.org/~picobsd/picobsd.html>. Also see Linux tiny
projects, below.

I would say that UNIX is a waste if you have only a single process. If you have no 
resources to control, you don’t need an operating system, but as soon as you want to 
have multitasking and process scheduling, the advantages begin accumulating. 




Conclusion 

Building good software in an embedded product is hard. For some applications, a
commercial RTOS will fit the problem, but there is a large set of applications where a
binary-only operating system will get in the way. Developers who have had UNIX
source code know the tradeoffs. Programmers who have never had access to an
open-source operating system might not appreciate the advantages. Peter Neumann [1] sums
it up nicely: “The potential benefits of robust open-source software are worthy of
considerable collaborative effort.” Give it a try.


Reference 

[1] Peter G. Neumann, “Robust Open-Source Software,” Communications of the ACM, 
February 1999, Vol. 42, No. 2. See also <http://www.opensource.org> and <http://www.gnu.org>. 

Resources for Source Code UNIX

You may feel that some consulting help or commercial support would best fit your 
needs. You have a number of options, including hiring consultants such as those found 
at: 

<http://www.freebsd.org/commercial/consulting_bycat.html> 

<http://www.openbsd.org/support.html> 

<http://www.linux.org/business/index.html> 

<http://metalab.unc.edu/LDP/HOWTO/Consultants-HOWTO.html> 

You could turn to companies that specialize in embedded Source Code UNIX: 

Cygnus is a leader in Open Source-based software development tools, mission-critical 
support, and custom engineering for the embedded-systems market. 
<http://www.cygnus.com> 

Boulder Labs specializes in high-bandwidth embedded UNIX systems.
<http://www.boulderlabs.com> 

RTMX O/S is based on OpenBSD; they add realtime scheduling, high-resolution
timers, and contiguous file support. Complete source code for the RTMX O/S
extensions and drivers is available. They support a large set of processors including those
listed at <http://www.rtmx.com>.

Linux/Microcontroller project: <http://ryeham.ee.ryerson.ca/uClinux>

muLinux (micro-Linux): <http://www4.pisoft.it/~andreoli/mulinux.html>

Linux Router Project: <http://www.linuxrouter.org> 

Linux on one floppy disk: <www.toms.net/~toehser/rb> 

Resources For Commercial RTOS 

pSOSystem has a realtime multitasking kernel and networking support for a large set of
processors. <http://www.isi.com/Products/pSOS/index.html>

VxWorks, from Wind River Systems, is a realtime operating system with networking
facilities. <http://www.wrs.com/products/html/vxworks.html>

Inferno is designed to be a complete solution for the embedded market, uniting
operating-system functionality, networking, and security within a small-footprint OS
platform. <http://www.lucent.com/inferno>

BeOS was designed as an operating system for processor-intensive multimedia and
Internet applications. <http://www.be.com>

LynxOS is a scalable realtime operating system with UNIX/POSIX APIs. LynxOS looks
and feels like UNIX but was developed with deterministic hard realtime response in
mind. <http://www.lynx.com>

QNX is a realtime, extensible POSIX OS with a lean microkernel and a team of
optional cooperating processes. This flexible architecture lets you scale QNX down for lean
embedded systems or scale it out to create a virtual supercomputer orchestrating
hundreds of processors. <http://www.qnx.com>

musings


By now, you are all aware that Microsoft has ported Office to Linux. I have used 
a Beta version of the port and can report that Word works just like it does 
under Windows, including all of its annoying features (such as underlining 
phrases “it” doesn’t understand). 

Bill Gates, in another of his famous “about-face” speeches, announced that Linux was 
obviously the wave of the future, and that Microsoft would soon be releasing
extensions that would make Linux more compatible with Microsoft products. The source
code to these extensions, and the underlying APIs, would require a standard Microsoft 
nondisclosure agreement to view. 

Alas, I can no longer get away from using Office. Many of my friends and colleagues 
already consider Microsoft formats the lingua franca of the Internet and send me Word 
documents as email and schedules in Excel spreadsheets, and expect me to produce 
seminars in PowerPoint. I had been returning the documents and asking for something 
that I can read in vi, but Office for Linux makes it look like dodging the issue will no
longer be possible. 

April fool. While it is true that Sun, HP, and Silicon Graphics announced that they 
would support a Linux port to their hardware, I don’t expect that Microsoft will offer a 
port of Office until it is too late to do them any good. 



<rik@spirit.com> 


by Rik Farrow 
author of UNIX System

Hard Facts

Hardware vendors sell hardware. The operating systems help them to sell that
hardware, which, you must admit, will just sit there uselessly without it. UNIX hardware
vendors have traditionally used their operating systems as a means of differentiating
their product offerings, when all we wanted them to do was make the operating
systems work as alike as possible. Vendors instead put a lot of time, money, and effort into
doing exactly the opposite. Now perhaps we can get what we have been asking for since
the 80s.

Of course, the wider acceptance of Linux could also be its death knell. As Linux 
becomes more commercialized, different vendors will begin to include extensions that 
will make their version less interoperable with other versions. Even if these extensions 
are open source, they can still lead to problems. 

Take for example the current scheme in HP-UX 10 and 11 for shadowing the password
file. Instead of having a single shadow file that includes the encrypted passwords as well
as additional, useful information (e.g., password aging and account expiration date),
the HP-UX system uses separate files for each user account. These files are organized by
directory, with a different directory for each letter of the alphabet. Most people that I 
talk to at the HP World conferences do not use shadowing under HP as a result of this. 
Now imagine that HP decides to “extend” Linux to use this same, overly complex 
scheme. Yikes! There goes the single version of your account management tool. 

While I have great hope for the future of Linux, I am concerned about
commercialization. Will Linux programmers “sell out”? Will there be Linux wars, like the UNIX wars
of the late 80s and early 90s? I certainly hope not.

The Desktop 

And while Microsoft won’t be supporting Linux any time soon, I think that many of us 
would agree that Linux makes more sense on desktops than does NT. But there are 
some problems here, many in the area of the ability of mere mortals to manage a UNIX 
system. 

AT&T made an attempt to put UNIX on desktops. They created a stripped-down
version of UNIX without most of the tools you would use in ordinary script writing (this
was pre-Perl and Tcl). The result was a total failure. The operating system was too
costly for desktops (about as much as NT is now), too hard for the user to manage, and
unmanageable without the standard set of UNIX tools. And besides, a stripped-down
UNIX system was just not politically acceptable to the very people who already used
and managed UNIX systems.

Linux and other versions of UNIX today are just as unmanageable. While I do like a lot
of the tools that come with Red Hat 5.1, it took a lot of man page reading and good ol’
UNIX-style futzing around to get pppd working from my laptop. I would not ask a
Windows 9x or NT user to do that. And I would certainly not ask my mom.

The very flexibility of UNIX defeats most schemes designed to make it more
manageable. Yet, something must be done if UNIX is ever to take over desktops. Apple is
working on this right now. Mac OS X, rumored to be out this year, will have the Apple
desktop and applications sitting above the NeXTSTEP (Mach) kernel. Terminal windows and
the UNIX command line may be there (I hope), and there was also supposed to be a
Java Virtual Machine. Will Apple succeed where others have failed? I don’t know, but I
do wish them luck.

Devices 

The other big edge that Microsoft has over Linux is in hardware support. They have
this because they have lots of person-power to manage and include thousands of
different device drivers. And any hardware vendor is willing to write a decent device driver
for Windows (and maybe, just maybe, NT too). Sun Microsystems has been the
outstanding exception so far.

The list of supported devices (and motherboards) for Windows is perhaps an order of 
magnitude larger than it is for Linux or any other version of UNIX. While Linux 
already runs on more different processors than any other operating system, its list of 
supported devices is much smaller. Have you tried to install something other than a 
Sound Blaster sound card in a Linux system? If Linux won’t make noise when the user 
plays games, it’s all over for the desktop market (even if I consider Jaz drive support 
more important). 

This will be the first barrier that must be overcome for the broader acceptance of Linux 
or any version of UNIX. Easy installation of Linux relies on the right device drivers 
being present or easy-to-get-and-install. Already, installing most UNIX systems is much 
easier (barring device-driver problems) than NT, and about four times faster. 

Of course, there is one other really nasty device problem - and again Microsoft really
lords it over UNIX in this area - and that is in setting up the windowing system. I am
sure most of us have been through this in installing X on a PC. You must choose the
correct timings for the X-window server to work correctly, and this is usually a
hit-or-miss operation. What’s worse, the instructions tell you that poor choice of timing
values might damage your hardware. Yikes! This “feature” helps me to understand why so
many of my UNIX friends run Windows on their notebooks.
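
For readers who have not been through it, the timing arithmetic itself is simple; the hit-or-miss part is finding values your monitor can tolerate. A video mode is defined by a dot clock and the total number of pixel times per line and lines per frame (including blanking), and the monitor has to be able to sync to the resulting horizontal and vertical rates. The Python sketch below does that calculation for the standard 1024x768 at 60Hz timing (65MHz dot clock, 1344 total pixels per line, 806 total lines). The function names and the monitor ranges in the usage note are invented for the example; real limits come from the monitor's manual.

```python
def sync_rates(dot_clock_hz, h_total, v_total):
    """Return (horizontal Hz, vertical Hz) for a video mode.

    h_total and v_total are the totals including blanking, as given
    in the mode's timing line.
    """
    h_freq = dot_clock_hz / h_total
    v_freq = h_freq / v_total
    return h_freq, v_freq


def within_limits(h_freq, v_freq, h_range, v_range):
    """True if both rates fall inside the monitor's stated sync ranges."""
    return h_range[0] <= h_freq <= h_range[1] and v_range[0] <= v_freq <= v_range[1]


h_freq, v_freq = sync_rates(65_000_000, 1344, 806)
```

This mode works out to roughly 48.4kHz horizontal and 60Hz vertical. A check such as `within_limits(h_freq, v_freq, (30_000, 50_000), (50, 90))` passes for a monitor with those hypothetical ranges, while a monitor limited to 40kHz horizontal would fail it, and driving such a monitor anyway is the case the instructions warn about.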

It would be nice if Sun’s Jini would help here. Jini is an object-oriented interface to
devices that makes it possible to just connect a device to a network and have it instantly
available and able to communicate without any device-driver installation or
configuration. Essentially, software (Java, but it could be other) queries the device for its
interface, and the device provides its own device drivers.

Jini won’t work for bus-installed devices. Bus-installed devices are designed to work as
fast as possible (that is, at bus speeds designed to rival memory speeds), and I don’t 
think that Jini will work for disk, CD-ROM, sound cards, and other devices where 
latency is critical. I may be wrong, but I doubt it. 

The real problem in writing device drivers has been corralling the device vendors.
Hardware vendors pretty much do as they please. We have IBM to thank first, and Intel 
more recently, for there being any compatibility between PC devices at all. There have 
been attempts to standardize device interfaces, but none have succeeded at the software 
level necessary for making writing device drivers easy. Just getting data sheets from 
some vendors has proven impossible - unless perhaps it is Microsoft asking. 

None of these problems is insurmountable, especially considering the legions of
programmers who are working on them.

So I have a favor to ask. I would like those of you who are working on Linux or the 
BSD versions to send me article proposals on work designed to solve these problems. I 
am also interested in articles about how each organization decides what new features or 
code to include in each release, articles about the performance and reliability of the 
chosen filesystems, and plans to make Linux or other UNIX versions more manageable 
for mere mortals. I will propose having a special issue about this in ;login: this year 
based on the response to this plea. 

Please don’t send me completed articles. Send me a page or two that clearly shows me 
that you know what you want to write about, know your topic, and can write. Don’t 
worry if you are not an excellent writer - USENIX has a wonderful copy editor who 
will help you. Let’s see if we can use ;login: to advance the future of operating systems 
in general, and UNIX in particular. 

And please do not send me Word or any other Office documents. My bit bucket may be 
bottomless, but why waste your time? 


The following Reports are
published in this column: 

Open Source - A Standards Success Story? 

Is Linux the Future of POSIX? 

Our Standards Report Editor, Nick Stoughton, 
welcomes dialogue between this column and 
you, the readers. Please send your comments 
to: <nick@usenix.org> 





edited by Nicholas 
M. Stoughton 

USENIX Standards Liaison 


Open Source - A Standards Success Story?

Nicholas M. Stoughton <nick@usenix.org> 

This year, the buzz is all about Open 
Source. The LISA ’98 keynote speech, by 
Eric Allman, was about Open Source. 
Then, of course, there were the now
infamous Hallowe’en documents from
Microsoft. The idea has been around for 
a long time, but Eric Raymond’s paper 
“The Cathedral and the Bazaar” really 
sparked the most recent formalization of 
the concept. Those of you who want to 
know more about the concept should 
look at <www.opensource.org>. 

Open Source software is extremely
successful, in general. Without standards, it
would probably still succeed, but it would
have a far more limited appeal. Just how
many successful Open Source applications
are there that don’t have a POSIX-style
system as the base starting point
(even if they have been successfully ported
to non-POSIX systems)? However,
developers of Open Source software still
have to riddle their code with #ifdef’s
to allow for the enormous varieties of
systems that are in existence. If the
standards community had really achieved
everything it wanted to, far less of this
conditional compilation would be needed.
But at least the majority of systems
behave in a largely similar fashion, and
that really is thanks to standards. In “The
Cathedral and the Bazaar,” Eric Raymond
points out, “When your code is getting
both better and simpler, that is when you
know it’s right.” The fewer the differences
between systems, the easier it is to
achieve this goal.

The idea that standards such as POSIX 
make source portability easier is most
relevant when the source itself is open.
Typical closed or proprietary applications 
do not worry so much about standards; 
they are out to use all the proprietary 
interfaces they can, to squeeze every last 
drop of performance out of the systems 
they have targeted. Just how much easier 
has POSIX made life for a company such 
as Oracle? (I am not qualified to answer 
that question personally, but I’d be very 
interested to hear a response from you if 
you work for an organization like that.) 
But how much easier has POSIX made it 
for the adoption of BIND or Apache? 

It is very interesting to look through 
some of these Open Source applications
and see where there is a lack of standardization.
Perhaps we in the standards business
should take more notice of these.

For example, which header files are needed,
and in what order; where certain files
are located; and how options are handled 
regularly cause problems. These are the 
areas we need to consider in the current 
revision of POSIX and The Open Group’s 
Single UNIX Specification. 

And what of that project? Last 
September, representatives from The 
Open Group’s Base working group, the 
IEEE Portable Applications Standards 
Committee (PASC), and ISO/IEC 
JTC1/SC22/WG15 (POSIX) met in 
Austin, Texas, to agree that the POSIX 
standards needed revising and to decide 
to do the work together. Because of the 
location, the group has become known as 
the Austin Group. It has taken six months 
to settle the basic ground rules for how 
that group will operate, but finally we 
appear to be ready to start the real work. 
Projects will be sponsored in each of the
organizations to produce a single common
set of specifications - initially four,
covering what is now in the two POSIX.1
and POSIX.2 standards. The four books
planned are: the system APIs (currently
POSIX.1 and XSH); Commands and
Utilities (POSIX.2 and XCU); a new
common definitions book; and a separate
guide to using the standards, to which the
rationale will be moved. The next meeting
is planned for early March, with
subsequent meetings in Montreal in July and
somewhere in Europe (probably
Copenhagen) in the fall. The first drafts
of these four books should be ready
before the year 2000 rolls around.

All Austin Group meetings are open to 
anyone; there is no membership requirement.
All the drafts are intended to be
freely available via the Web for comment. 
By helping to get these standards correct, 
useful, and usable, we can ensure a bright 
future for Open Source developers, our 
real target audience. 

Is Linux the Future of POSIX? 

David Blackwood <dave@rc.gc.ca> 

On December 15, 1998, Stephen Michell, 
chair of the Canadian Standards 
Association Committee on Programming 
Languages (CSA/CPL, equivalent to the 
US ISO/IEC JTC1/SC22 Technical 
Advisory Group, or TAG), gave a presentation
to the Ottawa Carleton UNIX
Users’ Group (OCUUG) entitled 
“Standardization of Programming 
Languages - A Challenge for the 1990’s.” 

“The 1980s and the 1990s have seen an
explosion of Information Technology
tools, development environments and
development languages. Machine code,
assemblers, imperative programming
languages, Fourth Generation languages,
Visual development languages, each
promise to give us the edge in performance,
code quality, development speed,
simplicity for users and much more. As
each new language or technology comes
along, it is often embraced with enthusiasm
and intense interest by some segment
of the industry. It is only after a
number of projects and challenging
experiences that practitioners realize that they
are locking themselves into vendor-specific
solutions that make the task of
upgrading, porting, and training new
people much more difficult than they
had anticipated. Alternatively, they are
working in a mature language or technology
but need the expressive power and
concepts introduced by newer technologies,
but preserving their investment in
existing systems. These are the challenges
of language standardization in the 1990’s.
New languages and concepts arrive that
somehow need a globalization and
solidifying process to guarantee portable tools
and systems to practitioners. New concepts
must be introduced into mature
languages so that mature systems can
adopt and grow with new technologies.
International character sets and location
profiles must somehow be integrated.

And a world consensus is needed. The 
presentation examined the challenges 
faced by language standardization and 
how the ISO/IEC language standards 
groups cope to guarantee the most
up-to-date and portable languages that can be
imagined. - And we do it by consensus.” 

Present that evening were a number of 
members of the Linux community who 
do not normally attend OCUUG meetings,
preferring instead to attend their
own Ottawa Carleton Linux User’s Group
(OCLUG) meetings. However, at the suggestion
of a few individuals that OCLUG
would be receptive to an invitation to this 
presentation, one had been extended to 
them several weeks earlier. 

Attendance was good and several Linux 
users stepped forward afterward to 
express their interest in becoming 
involved in standards activities, including
the Canadian POSIX Working Group
(CPWG, equivalent to the US WG15 
TAG). It remains to be seen how many 
actually follow through; however, the 
high level of interest was encouraging. 

Interestingly, the week before this
event, discussions were held at the
USENIX LISA conference between members
of the standards and Linux communities
about bringing more Linux users
and developers into the formal standards
process at the U.S. national and international
levels. The outcome of these
discussions was also encouraging. USENIX
is considering sponsoring two representatives
from the Linux community to
participate in the U.S. national standards
process.


One possible measure of system maturity 
is the emergence of viable divergent 
implementations. UNIX reached this 
point with the release of 2BSD in 1978 
and continues the tradition today despite 
the POSIX family of standards. Some 
blame the failure of POSIX to create a 
“one true UNIX” for the rise of Windows
NT as a potential competitor in the high-end
workstation and server market. I
believe that, seeking to avoid the mistakes 
of UNIX vendors in the past, the Linux 
community is beginning to realize the 
value of standards, and hopefully the 
value of POSIX. I would only further 
encourage them to use the influence of 
their numbers to direct the future course 
of POSIX and not to create yet another
competing alternative in an already fragmented
marketplace.


While Linux continues to gain mind
share as well as market share, it struggles
with how to handle product differentiation
at the commercial level while still
maintaining a single source tree for
developers. Support from more traditional
UNIX vendors continues to grow with
recent announcements from HP, IBM,
and Compaq. They join Oracle, Corel,
and a multitude of smaller software vendors
in providing support for the platform.


Is Linux the future of POSIX? Quite possibly.
At the very least the future of
POSIX looks a lot brighter if it includes 
Linux than if it doesn’t. Sun, SCO, and 
you other UNIX vendors, are you 
listening? 
Paul E. Ceruzzi

A History of Modern Computing

Cambridge, MA: MIT Press, 1998. Pp. 398.
ISBN 0-262-03255-4

Dorothy E. Denning

Information Warfare and Security

Reading, MA: Addison Wesley, 1999. Pp. 522.
ISBN 0-201-43303-6

Michael J. Hammel

The Artists’ Guide to the GIMP

Seattle, WA: SSC, 1999. Pp. 340.
ISBN 1-57831-011-3

Brian W. Kernighan & Rob Pike

The Practice of Programming

Reading, MA: Addison Wesley, 1999. Pp. 250+.
ISBN 0-201-61586-X


by Peter H. Salus 

Peter H. Salus is a member 
of ACM, the Early English 
Text Society, and the Trollope 
Society, and is a life member 
of the American Oriental 
Society. He has held no 
regular job in the past 
lustrum. He owns neither a 
dog nor a cat. 

<peter@pedant.com> 


Too many books. Too little time. But 
there are four that really stand out. 
However, before starting, I need to make 
a formal apology. On two occasions I’ve
taken Addison Wesley to task for allowing
Kernighan and Plauger’s monumental
Software Tools to go out of print. Just
before last Christmas I was told that I
was mistaken, it was out of stock, but
reprinting. I now have a pristine copy of
the “26th Printing, December 1998.” It’s a
pity that (like Rick Blaine) “I was
misinformed.” As of the end of January,
Amazon.com, Powell’s, and Bookfind 
couldn’t locate it, either. 

Hardware History 

Ceruzzi (A History of Modern 
Computing) has done all of us a great 
service. 

This is a fine, solid history of the commercial
computer from 1945 to the present;
from the time that computers were
people to the time that they are children’s
toys.

Both the hardware/engineering and the 
human elements are traced, and I was 
pleased to see that (for example) the 
founding of SHARE (in 1955) was 
included, as I consider that the beginning 
of “free software,” or (at least) shareware. 
The only disappointing thing about this
History is the fact that it slights languages
and operating systems. While the
first decade or so after the Second World 
War was one of hardware innovation, 
from 1955-57 on, software came to the 
fore. 


Chapter 3 (pp. 79-108) concerns the 
early history of software, but it is full of 
lacunae. These are not errors! I think that 
Ceruzzi decided what to include and 
what not to include. And this is obvious 
in the detailing of the hardware, too. 

In software, most of the OS and language
work of the last decades is ignored. Ada,
Smalltalk, Icon, Tcl, and Perl are among
the languages unmentioned. Chorus,
Mach, and Linux are among the operating
systems. CTSS is mentioned on pp.
155f., but Corbató’s name never appears.
We are told that C was derived from B,
but neither BCPL nor its creator, Martin
Richards, is to be found.

At least part of this is understandable: 
Ceruzzi is a curator at the National Air 
and Space Museum, and museums have 
always been worse at collecting things 
that aren’t “hard.” 

Ceruzzi wrote an excellent book on computers
from 1935 to 1945, Reckoners: The
Prehistory of the Digital Computer, some
years ago. Despite the “sins” of omission,
Ceruzzi presents an outstanding history
of much of the hardware and some of
the software of the past half-century. I
can’t wait for his next one.

Security 

Denning (Information Warfare and
Security) has produced a first-rate book.
But I have a confession to make: I was a 
reader of the manuscript for Addison 
Wesley and am thanked in the “Preface.” 
If this makes my review invalid, I’m 
sorry. Furthermore, I am complimenting 
Denning’s work despite the fact that she 
is a strong proponent of key escrow, to 
which I am opposed. 

Denning has produced a volume of great 
importance: not a book about how to set 
up password files or firewalls, but a 
sober, adult discussion of the dangers of 
what she calls “information warfare.” 

Denning starts out by recounting a tale 
of five Dutch crackers who broke into US 
DoD computers in 1991. She then moves 
on to her view of information warfare,
going into real detail about offense,
defense, and the value of such resources
to the opposing sides. She limns this as
going from playgrounds to battlegrounds.

The next sections are fascinating to read
against Bill Cheswick’s paper at SANE’98
in the Netherlands. Denning is far more
sober than Cheswick, but her
warfare/battlefield analogies correspond
directly to Cheswick’s moats,
sieges, treachery, etc.

Because of the nature of information
warfare, Denning delivers a lengthy section
on methods of attack. Using open
sources to spy on individuals, copyright
infringement, deception, hoaxing,
defamation, spam wars, insider abuse,
wiretapping, packet sniffing, telecommunications
fraud, and about two dozen
other things are detailed.

Finally, Denning looks at defensive measures,
ending up with national policies.

One of the very best aspects of this book 
is the manner in which the frequent 
anecdotes are employed to increase the 
salience of subjects under discussion. 

At the very end of Information Warfare
and Security, the author writes, “The connection
between encryption policy and
security is not simple, and may be vastly
overstated. Security demands much more
than encryption, and encryption deployment
is affected by factors other than
government policy.”

She’s right.

Image Manipulation 

GIMP is the GNU Image Manipulation
Program, a “free” version of Adobe’s
Photoshop, designed to be used on a
variety of UNIX and Linux platforms.
There was an article about GIMP in the
November 1997 Linux Journal; now
there’s a respectable and beautifully illustrated
book (The Artists’ Guide to the
GIMP) for prospective users. Hammel
deserves the thanks of all those who
design and work with computerized
images. In barely 300 pages, he manages
to introduce GIMP and elucidate the
majority of its features, with high-quality
images to show exactly what the GIMP
tools and filters do.

GIMP is another demonstration of the 
bazaar taking over from the cathedral. I 
hope it becomes as widely used as it 
deserves. 

Practice Makes Perfect 

I mentioned Software Tools at the beginning
of this column. It is still one of the
best and most useful books on the “tool”
philosophy. But The Practice of
Programming is an absolute must for
anyone reading ;login:. I am writing this
on the basis of a draft manuscript, but
the book will be out by the time you read
this. In nine sections (“Style,”
“Algorithms and Data Structures,”
“Design and Implementation,”
“Interfaces,” “Debugging,” “Testing,”
“Performance,” “Portability,” and
“Notation”), Kernighan and Pike serve
up a scintillating volume that covers the
gamut of skills and problems encountered
by programmers.

My favorite passage is in Section 5.2: 

Oops! Something is badly wrong. My 
program crashed, or printed nonsense, 
or seems to be running forever. Now 
what? 

Beginners have a tendency to blame 
“the compiler,” the library, or anything 
other than their own code. 

Experienced programmers would love 
to do the same, but they know that, 
realistically, most problems are their 
own fault. 

The book ends with a brief epilogue and 
a three-page appendix: “Collected Rules.” 
If I had my druthers, every high school 
and college student taking programming 
would be compelled to learn them. 
Among the best: 


Be clear. 

Be accurate. 


Keep records. 

Test incrementally. 

Use standard compilers. 

Don’t assume ASCII. 

Don’t assume English. 

Thanks, Brian and Rob. 

Note 

I’ve gotten a heap of books on ATM and
another pile on Networking/Internetworking.
They take a lot of time to actually
look at. I hope to devote all (or
most) of the June column to a number of
these.



Notice of Annual Meeting

The USENIX Association’s Annual
Meeting with the membership and the
Board of Directors will be held at the
Monterey Conference Center, site of the
1999 USENIX Annual Technical
Conference. The date, time, and location
of the meeting will be published on
the USENIX Web site
<http://www.usenix.org/whatsnew/> in mid-May
and will also be posted to
comp.org.usenix at that time. This is a great
opportunity to get your questions
answered and to make suggestions on
how we might serve you better. Everyone
is welcome!


April 1999 ;login: 




THE BOOKWORM 





book reviews 


Dan Lynch, James P. Gray, and Edward
Rabinovitch, editors

SNA and TCP/IP Enterprise Networking 

Manning Publications, 1997. 

Pp. 540. $60.00. ISBN 0-13-127168-7 

Reviewed by Daniel Lazenby 

<dlazenby@ix.netcom.com> 

Most organizations have their own share
of local-area networks. Multiple networks
(both local- and wide-area) using a variety
of protocols may have been inherited
as the result of a consolidation or merger.
Whatever their origin, these networks
need to play well together. Making them
do so requires the network practitioner
to make choices. These choices must
maintain a balance among cost, reliability,
achieving today’s goals, remaining
compatible with legacy architectures, and
providing a migration path to
yet-to-be-defined networking technologies. For
many organizations, building a network
over from the ground up is not an
option.

SNA and TCP/IP Enterprise Networking,
offered as a handbook for network practitioners,
is based on the fundamental
premise that multiple protocols will
coexist forever. The authors feel this book
will assist the reader with implementing
reliable, cost-effective networking solutions
with a migration path to future
technologies.

The book first exposes the reader to a
brief history of the development of
multi-protocol communications. A brief
discussion of the basic philosophical differences
between SNA and TCP follows.
The authors also describe important SNA
multi-protocol integration products and
popular deployment methods. The book
closes with a look at emerging solutions.
The authors illustrate both the strengths
and weaknesses of each technology.


While the book provides a balanced presentation,
its primary view of the world
is from an SNA perspective.

The TCP and SNA sections open with a
set of functionally oriented tutorials.
They introduce SNA terminology and
acronyms, and present the what and why
of a subject. Many of these tutorials walk
you through the flow of an SNA process.
A couple of these chapters introduce
TCP/IP concepts and terminology. Other
chapters present application programming
interfaces (APIs) that may be used
with either of the two network protocols.

Chapter 8, the last chapter of part I, summarizes
the information presented in the
first seven chapters. In many places it
provides a side-by-side functional comparison
of the two network architectures
and clearly illustrates the similarities and
differences in performance between
them. The authors review the two protocols’
usability and reliability characteristics,
and they present approaches to and
methods for converging the two protocols.

Following the tutorials is an examination
of SNA’s interoperability features. The
book’s second part, “SNA Interoperability
Today,” describes currently available
solutions: multi-protocol routers, gateways,
and other software. Part II opens with a
description of the benefits and pitfalls of
today’s popular SNA internetworking
strategy. The next chapter examines why one
might want to encapsulate SNA within
TCP/IP and provides some alternatives to
encapsulation. Another chapter presents
an approach for using SNA as the transport
for non-SNA protocols. Topics such
as providing 3270 terminals access to
TCP/IP applications, and questions and
considerations when selecting or implementing
an SNA gateway, are covered in
the last couple of chapters of part II.


Host (mainframe) systems and their SNA
networks have not gone away. Some new
mainframes are even being sold as “enterprise
servers.” Part II concludes by
examining ways Web technology can be
used to enhance, extend, and leverage
legacy platforms and the SNA architecture.
This platform-independent
client/server technology is inexpensive,
easy to use, and readily available for
many platforms. Web browser technology
may just turn out to be the great equalizer
among platforms.

The third section of the book, “Emerging
Solutions,” discusses the impact of recent
technology on SNA presentation services.
It suggests that future SNA applications
will be “Common User Access compliant
client/server applications written in Java.”
In closing, the last chapter gazes into the
protocol crystal ball, presenting one view
of how distributed computing technology
may generate the next revolution in
protocols.

Overall, the authors offer a model for
synthesizing a multi-vendor, multi-protocol
network into a cohesive whole that
can appear to applications and the user
as a single integrated network.
USENIX

The John Lions Award 

by Dr. Lucy Chubb

Senior Consultant, Softway Pty Ltd.; President, AUUG
<lucyc@softway.com.au>


We are pleased to announce a second
contribution by USENIX of $6,000 to the
Lions Award fund. Ted Dolotta and John
Mashey designated that the funds
received from the auction of the
California UNIX license plate be contributed
to the fund.

The John Lions Award for Research Work
in Open Systems was instituted in 1997
to honor the leading role Lions played in
bringing UNIX to Australia, in the formation
of AUUG, and in promotion of
the values held by the open systems
community.

The winner of the inaugural Lions 
Student Award in 1997 was Jerry 
Vochteloo, a Ph.D. student at the 
University of New South Wales. His work 
involved implementing UNIX-like “rwx” 
file protection on Mungi Objects (Mungi 
is a single-address-space object-oriented 
operating system that is being developed 
at the University of NSW). 

Last year’s winner was Steve Blackburn of
Australian National University in
Canberra, for work involving the creation
of an orthogonally persistent version of
Java. (I’ve read that his work has attracted
interest from some of the big computer
manufacturers.)

Over the past few years, I’ve heard it
asked a number of times whether there is
anything interesting going on in the
operating system/open systems area. I
believe there is. It’s just that people aren’t
seeing it. This award should both encourage
good work in the area and publicize
the good work that’s happening. Please
visit <http://www.auug.org.au/lions/> for more
information on the John Lions Award.


news 

New Staff at USENIX 

The small (average under 5'4") but energetic
staff at USENIX has welcomed several
additions lately. We’d like to introduce
you to:

Gale Berkowitz. Before coming to the
USENIX Association, Gale was on the
faculty at the University of California,
San Francisco. While her professional
training is in public health research and
epidemiology, she has long been interested
in innovations in technology. In spite
of her own experience with UNIX in the
early ’80s (which drove her to purchase
her first Macintosh six months later) and
her brother-in-law’s opinion that UNIX
was a gift from the devil (though both
admitted that it has improved immeasurably
since then), she accepted her new
position as Deputy Executive Director of
USENIX without hesitation. As the DED,
Gale will manage the day-to-day operations
of the Executive Office, finances,
and the Good Works and student programs,
and will provide support to SAGE.
She can also sometimes be found doing
light housework around the office.

Bleu Castaneda. USENIX Administrative
Assistant Bleu comes to us with a background
in retail/customer service. She
says she moved to San Francisco to study 
film production at the Academy of Art, 
but her apparent urge to visit every coffee 
shop in the world makes one wonder ... 

Cami Edwards. Southern California born 
and raised, USENIX Administrative 
Assistant Cami graduated with a B.A. in 
history from Cal State Long Beach. She’s 
been on the Internet for close to five 
years and professes a fascination with all 
things UNIX and geeky. She can often be 
found dancing to techno or house. 

Jane-Ellen Long. JE has combined computers
and publishing throughout her
working life, although her heart remains
in Victorian England. She came to us
from the University of California Press,
where she served as Director of
Information Systems. In her younger
years she worked variously as production
editor, copy editor, typesetting shop manager,
even association conference manager.
How old is she? You figure it out: she
still has a punched card for a PDP-10. As
Publications Director at USENIX, JE
manages the Web site and print and
online production of the conference
proceedings, and serves as Managing Editor
of ;login:.

Jennifer Radtke. Jennifer was drafted to 
work at USENIX by the new Publications 
Director, Jane-Ellen Long, for whom she 
had worked at the University of 
California Press, doing computer support 
and Web design. Her time at USENIX is 
mostly filled with creating and editing 
pages for the USENIX Web site. Her time 
away from USENIX is filled with surfing, 
playing electric violin, and dyeing her 
hair purple. 

20 Years Ago in U[SE]NIX

by Peter H. Salus 

<peter@pedant.com> 

The January 1979 (Santa Monica) meeting
had been a success: 350 attendees.

The June meeting was scheduled for 
Toronto. The planning was underway. 
First, there was an announcement of the
“UNIX Users Group Conference” (the
long arm of the AT&T lawyers hadn’t
reached to Queen’s Park or St. Joseph
Street yet), to be held at the University of
Toronto, “Wednesday, June 20, 1979
through Saturday, June 23, 1979,” with
“Registration: Tuesday evening, June 19,
1979, 7:00PM - 9:00PM.”

Second, there was an announcement of 
the “Software Tools User Group 
Meeting,” to be held “June 19, the day 
before the general UNIX users meeting.” 

I’ll write about the meeting(s) in June.
But in many ways, 1979 was to be an
important and eventful year for UNIX.
Toward the end of 1978, an early version 
of 32V - the port to the VAX by Charlie 
Roberts’s group in Holmdel - made its 
way from New Jersey to California. The 
cohort at the CSRG immediately began 
working on turning it into 3BSD. At the 
same time, the group at Bell Labs were 
turning V6 and 32V into V7. 

By the time of its release, V7 was a truly 
wonderful system. Less than two years 
after DEC announced the VAX, here was 
a 32-bit OS (32V) and a new and updated
version of UNIX for the PDP-11.
Moreover, it contained Steve Bourne’s 
new shell, as well as grep, uucp, awk, lex, 
lint, etc. The V7 manual was also the 
first to be commercially published: I still 
have my copy of the Holt, Rinehart and 
Winston printing of 1979. 

V7 also served as input to 3BSD. In
fact, when it appeared, 3BSD was everything
that Bill Joy had said he wanted
Berkeley’s “product” to be.

Bill had said that he was tired of doing 
releases that were made up of utilities 
and decided he wanted to release systems. 
32V and V7 (which Berkeley had gotten 
in January) were the material he worked 
with. 

3BSD was a complete, bootable system. It
had a bootblock at the beginning of the
tape, so you could roll it onto raw hardware.
3BSD had a virtual-memory-based
kernel, and all the utilities had been booted
across. Perhaps more importantly, if
you wanted to run UNIX with a paging
system, you had to run 3BSD.

One of the people who wanted to run 
UNIX was Brian Harvey. In January 
1979, Brian went to Lincoln-Sudbury 
Regional High School and persuaded the 
school board to float a bond issue for 
computer equipment. Brian’s 15-year- 
olds ran UNIX on a PDP-11. 

Also in early 1979, Jim Kulp at the
IIASA in Laxenburg, Austria, bought a
VAX/780 and ran 3BSD on it.

Commercialization had also begun. In
1977 there was Interactive Systems; in
1978, P.J. Plauger’s Whitesmiths compiler
and Idris, the first UNIX clone. Now, in
1979, with UNIX looking forward to its
10th birthday, there came XENIX - a
collaboration between Microsoft and the
Santa Cruz Operation - the first UNIX
implementation for the Intel 8086.
There was going to be a lot to talk about
in Toronto.

USENIX Funding Helps 
Improve the US Patent 
Process for Software 

by Cynthia Deno

USENIX Marketing Director

<cynthia@usenix.org>


Invention in software technology, as in 
other fields, builds on previous advances. 
Inventions may be patented, unless they 
can be shown to be duplicative of previous
work. In these years of light-speed
advance in software development, the US
Patent and Trademark Office (USPTO)
has received and granted a rapidly
increasing number of patents for
software-related inventions. But some
patents are better deserved than others. 

Unfortunately, there is significant and 
growing uncertainty and controversy 
about the right to use a variety of
software-related technologies. Software engineers
run the risk every day of inadvertently
infringing a patent, even though
the particular technique being used is old 
and familiar. Inventors run the risk of 
having ownership of their contributions 
assigned to others by the USPTO. 

Such controversy and uncertainty trans¬ 
late into a great many dollars for patent 
royalties, patent litigation, and efforts to 
avoid the use of materials patented by 
others. Three major groups are involved: 
the software industry - companies that 
rely on software as well as those that pro¬ 
duce it - the USPTO, and patent profes¬ 
sionals, along with individual inventors. 

The Software Patent Institute was formed 
to help improve the patent process. It 
provides seminars and online access to 
prior art relevant to software-related 
technology. SPI has created a large and 
useful database of source documents - 
conference proceedings, journal articles, 
computer science theses, computer man¬ 
uals, etc. - that are not readily available 
elsewhere. The USPTO is an enthusiastic 
supporter of the not-for-profit SPI, and 
patent office staff are frequent users of 
the database. 

The USENIX Association, as part of its 
“Good Works” Program, recently granted 
the Software Patent Institute $55,000. 
This funding comes at an especially 
important time for SPI: they have a large 
backlog of material to load into their 
database but did not have enough money 
to go forward. This grant, says SPI 
Founder Dr. Bernard A. Galler, will 
enable the continued growth and 
improvement of this increasingly impor¬ 
tant database resource, which benefits the 
software community as a whole. 

The SPI Database of Software 
Technologies is accessible online by the 
public without charge. Along with helping 
the USPTO issue valid patents in the 
software field, the database helps software 
developers avoid the cost of defending 
against frivolous or otherwise invalid 
patents, and patent applicants can more 
easily and inexpensively research their 
claim. 

The USENIX Association is wholly sup¬ 
portive of SPI’s mission to catalogue and 
make accessible source documents and 
software prior art. USENIX also recog¬ 
nizes the enormous task involved in 
achieving a more complete database. 

The main problems encountered by SPI 
in building its database are obtaining 
copyright permissions and then funding 
the work of scanning and converting to 
machine-readable form early publications 
which exist only on paper. SPI staff are 
very efficient: a monospaced dissertation 
in Courier typeface can be put online 
pretty quickly. Documents such as the 
ACM Guides, however, with their tiny 
type and heavily abbreviated bibliograph¬ 
ic entries, pose a larger challenge, as do 
documents with a lot of non-textual 
material such as graphs, equations, fig¬ 
ures, or code. Nonetheless, even these 
yield to technology, patience, and time, 
and the SPI database is growing to pro¬ 
vide an ever broader collection of older 
computer science materials. 

For more information about SPI, consult 
<www.spi.org>. 
5th Conference on Object-Oriented Technologies and Systems 

COOTS '99 

Monday-Friday, May 3-7, 1999 

Town & Country Resort Hotel, San Diego, California 

Tutorial Program 

Monday and Tuesday, May 3-4, 1999 

Patterns at Work 

Frank Buschmann, Siemens AG 

Writing Efficient C++ Programs 

Stan Lippman, Consultant 

Implementing CORBA Servers Using the Portable Object Adapter 

Steve Vinoski, IONA Technologies 

Introduction to Java Beans 

Uwe Steinmueller, Siemens Microelectronics 

The COM(+) Programming Model 

Don Box, DevelopMentor 

Programming for the Jini™ Platform 

Ken Arnold, Sun Microsystems, Jini Team 

Patterns and Performance of Real-time Object Request Brokers 

Douglas C. Schmidt, Washington University, St. Louis 

Distributed Java: Building Collaborative Applications 

Ron I. Resnick, DiaLogos, Inc. 

Advanced Principles of Object-Oriented Design in UML 

Robert C. Martin, Object Mentor Inc. 

JavaBean Components: Specification, Design and Test with Catalysis/UML 

Desmond D'Souza, Platinum Technology 

[title partly illegible in source] ... DBMS Applications 

Matt BenDaniel, Object Design, Inc. 

Technical Program 

Wednesday, May 5, 1999 

Opening Session 

Opening Remarks & Awards 

Murthy Devarakonda, IBM T.J. Watson Research Center 

Keynote Address 

James Arthur Gosling, Ph.D. 

Chief Scientist Java Software; VP and Fellow, Sun Microsystems, Inc. 

James Gosling is currently a VP & Fellow at Sun Microsystems. He has built satellite data acquisition systems, a multiprocessor 
version of Unix, several compilers, mail systems and window managers. He has also built a WYSIWYG text editor, a constraint-based 
drawing editor and a text editor called Emacs for Unix systems. At Sun his early activity was as lead engineer 
of the NeWS window system. He did the original design of the Java programming language and implemented its original 
compiler and virtual machine. He received a BS in Computer Science from the University of Calgary, Canada in 1977. He 
received a PhD in Computer Science from Carnegie-Mellon University in 1983. His thesis was entitled "The Algebraic 
Manipulation of Constraints". 

Design Patterns 

Chair: Steve Vinoski, IONA Technologies, Inc 

Filters as a Language Support for Design Patterns in Object-Oriented Scripting Languages 

Gustaf Neumann and Uwe Zdun, University of Essen, Germany 

Performance Patterns: Automated Scenario Based ORB Performance Evaluation 

Sridhar Nimmagadda, Chanaka Liyanaarchchi, Douglas Niehaus, Anil Gopinath and Arvind Kaushal, University of Kansas 

Object-Oriented Pattern-Based Parallel Programming with Automatically Generated Frameworks 

Steve MacDonald, Duane Szafron, and Jonathan Schaeffer, University of Alberta, Canada 

Runtime Issues 

Chair: Yi-Min Wang, Microsoft Research 

Intercepting and Instrumenting COM Applications 

Galen C. Hunt, Microsoft Research and Michael L. Scott, University of Rochester 

Implementing Causal Logging Using OrbixWeb Interception 

Chanathip Namprempre, Jeremy Sussman, and Keith Marzullo, University of California, San Diego 

Quality of Service Aware Distributed Object Systems 

Svend Frolund and Jari Koistinen, Hewlett-Packard Laboratories 

Objects and Databases 

Chair: Rajendra Raj, Morgan Stanley & Company 

Resource Control for Java Database Extensions 

Grzegorz Czajkowski, Tobias Mayr, Praveen Seshadri, and Thorsten von Eicken, Cornell University 

Address Translation Strategies in the Texas Persistent Store 

Sheetal V. Kakkad and Paul R. Wilson, University of Texas, Austin 


For detailed tutorial descriptions, please go to: http://www.usenix.org/events/coots99 









